BCHM 6280 Exercise 1: Gene-specific information using NCBI, Ensembl and the UCSC Genome Browser

Due date: Friday, May 20th at 4:00 pm (email to [email protected])
Please name the file LastName_ex1.doc(x) or LastName_ex1.pdf

Web resources:
NCBI database: http://www.ncbi.nlm.nih.gov/
Ensembl database: http://useast.ensembl.org/index.html
UCSC Genome browser: http://genome.ucsc.edu/
Exercise 1 homepage: http://biochem.slu.edu/bchm628/exercise1.html

Background:
Reference: Alspach E, Flanagan KC, Luo X, Ruhland MK, Huang H, Pazolli E, Donlin MJ, Marsh T, Piwnica-Worms D, Monahan J, Novack DV, McAllister SS, Stewart SA. “p38MAPK plays a crucial role in stromal-mediated tumorigenesis.” Cancer Discov. 2014 Jun;4(6):716-29. PMID: 24670723; PMCID: PMC4049323.

The data we will analyze in this course come from the above reference; the study was conducted in the lab of Sheila Stewart at Washington University. I’ll go over how I reanalyzed the data for this course during the lecture/lab on Thursday, May 19th. For this exercise, we will use the gene encoding a kinase the authors call p38MAPK to learn how to find information about specific genes using various biomolecular databases.

1-1: Conduct text-based searches of NCBI and Ensembl
The authors of this paper use the protein name p38MAPK to describe the kinase that is the target of a small-molecule inhibitor, SB203580. Your task for this part of the exercise is to find the specific NCBI Gene and Ensembl records for the gene encoding this protein.
a) Search the NCBI Gene database using the query term: “p38MAPK AND human”.
b) Change the search query to: “p38MAPK AND human[Organism]”, or use the Advanced option to create the same query.
c) Search the Ensembl database for the human gene encoding p38MAPK. Change the dropdown menu to human, type p38MAPK in the search box and click GO.
The correct gene symbol is MAPK14. You will find it as one of the records further down the list.
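The queries in a) and b) can also be submitted through NCBI's E-utilities, which makes it easy to compare how the [Organism] field tag changes the result list. The following is a minimal sketch and not part of the assignment: the function names are hypothetical, and it assumes the public esearch endpoint with JSON output. The order of the returned IDs may differ from the order shown on the web results page.

```python
import json
import urllib.parse
import urllib.request

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(term, retmax=20):
    """Build an esearch URL that queries the NCBI Gene database (hypothetical helper)."""
    params = {"db": "gene", "term": term, "retmode": "json", "retmax": retmax}
    return ESEARCH_URL + "?" + urllib.parse.urlencode(params)


def search_gene_ids(term, retmax=20):
    """Return the Gene IDs esearch reports for a query (requires network access)."""
    with urllib.request.urlopen(build_gene_search_url(term, retmax)) as response:
        payload = json.load(response)
    # esearch JSON nests the matching record IDs under esearchresult -> idlist.
    return payload["esearchresult"]["idlist"]
```

Calling search_gene_ids("p38MAPK AND human") and search_gene_ids("p38MAPK AND human[Organism]") returns two ID lists that can be compared side by side.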
Based on this information, describe where in the search results from a), b) and c) you found this gene.
Questions to address in the narrative:
1) How far down the results page was MAPK14 located?
2) Is this an effective method for finding the Gene record of a specific gene?
3) How might you make this search more efficient?
4) Does searching PubMed help you find the correct gene name/symbol faster?

BCHM 6280 2016 Exercise 1

1-2: Finding transcript information about a specific gene using NCBI & Ensembl
a) Within the NCBI Gene record for the MAPK14 gene there are 2 sections that provide transcript/protein information: “Genomic regions, transcripts and products” and “NCBI Reference Set”. Export a PDF from the Genomic regions section. Here, genes are color-coded (green for protein coding, blue for non-coding). It also lists gene models (XR or XM). Using this information, answer the following questions in the form of a table that lists the accession numbers for the coding, non-coding, model and reference transcripts/proteins. Attach the PDF you downloaded from this section.
1. How many RefSeq protein-coding transcripts (with prefix NM) are listed for the MAPK14 gene?
2. How many RefSeq protein-coding transcripts (with prefix XM) are listed for the MAPK14 gene?
3. How many non-protein-coding transcripts are listed for this gene?
b) Within the Ensembl gene record for MAPK14, find the transcript table. Export it to CSV format and import it into Excel. Based on the data for the Ensembl MAPK14 gene record, answer the following questions:
1. Provide the definitions for TSL:1 and TSL:5.
2. How many Ensembl transcripts are protein coding? Non-coding?
3. How many Ensembl transcripts have a RefSeq counterpart?

1-3: Accessing genomic information about a specific gene
a) Search the GRCh38/hg38 human genome assembly at the UCSC genome browser for the MAPK14 gene.
Using the information provided and the track controls, create a table for the chromosomal context of the MAPK14 gene with the following information:
1. Chromosome number, coordinates and strand for this gene location.
2. How many common (v146) SNPs found in the genomic region of the MAPK14 gene occur in untranslated regions or are non-synonymous coding variants?
3. How many flagged (v146) SNPs are found in this genomic region?
Configure the image as follows:
Display only RefSeq genes
Turn off Repeats and Conservation tracks
Display common SNPs (146) in squished mode
Increase font size to 10 pt and window to 1000 pixels
Export as a PDF and attach to your homework.
b) Search the Ensembl site for the human MAPK14 gene, then click on the Location tab to view the genome browser. It should show essentially the same coordinates as the UCSC browser. If not, zoom in or out, or type in the correct coordinates. Based on the data in the MultiCell Regulatory features track, answer the following questions:
1. How many promoters are found in the genomic region of the MAPK14 gene?
2. How many CTCF binding sites are found in this genomic region?
3. How many enhancer binding sites?
Export the browser image in PDF format and attach to your homework.
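For part 1-3, the gene coordinates seen in the two browsers can be cross-checked against the Ensembl REST service. This is only a sketch under stated assumptions and not part of the assignment: the helper names are hypothetical, and it assumes the public rest.ensembl.org lookup-by-symbol endpoint, whose JSON response includes seq_region_name, start, end and strand fields.

```python
import json
import urllib.request

# Public Ensembl REST server.
ENSEMBL_REST = "https://rest.ensembl.org"


def build_lookup_url(symbol, species="homo_sapiens"):
    """Build the lookup-by-symbol URL for a gene (hypothetical helper)."""
    return f"{ENSEMBL_REST}/lookup/symbol/{species}/{symbol}?content-type=application/json"


def fetch_gene_record(symbol, species="homo_sapiens"):
    """Download the gene record as a dict (requires network access)."""
    with urllib.request.urlopen(build_lookup_url(symbol, species)) as response:
        return json.load(response)


def format_location(record):
    """Render a record as a UCSC-style position string with the strand appended."""
    # Ensembl encodes the strand as 1 (forward) or -1 (reverse).
    strand = "+" if record["strand"] == 1 else "-"
    return f"chr{record['seq_region_name']}:{record['start']}-{record['end']} ({strand})"
```

For example, format_location(fetch_gene_record("MAPK14")) produces a string that can be pasted into the position box of either browser after removing the strand suffix.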