Exercise 3: BLAST Database Searches and Pairwise alignments
Name:
Due date: Friday, May 27th at 4:00 pm
Purpose:
To become familiar with the different programs available at NCBI Blast.
To learn how to identify possible homologues using sequence similarity.
To understand limitations of BLAST searches.
Remember: you should have a narrative for each section of the exercise.
Background:
Sasquatch genome project: http://www.sasquatchgenomeproject.org/
Lecture notes on sequence similarity and alignments
BLAST tutorial
Activities:
Database searches using blastn, blastp and primer-blast.
Exercises:
3-1) Using BLASTN to identify a sequence of unknown origin
In November of 2014, DNA Diagnostics Inc., a forensics DNA company in Texas, issued
a press release claiming to have sequenced DNA from Bigfoot (aka “Sasquatch”). Sasquatch or
Bigfoot is an ape-like creature that supposedly inhabits the forests of the northwestern United
States. Download the DNA sample and, using NCBI BLAST, answer the following questions as
part of your write-up:
1) What databases did you use for the search(es)?
2) What was the top match in the database for each query sequence?
3) What is the most likely animal (taxonomic) source for this sequence, including
common name if available?
4) Does this sequence encode a protein? If so, which one?
As part of your report, you should include the query, query length and databases searched,
including limits. In describing the top matches, include the accession number, short
description, E-value, percent identity and percent query coverage to support your
conclusions.
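The values requested above can also be pulled from a tabular export of the BLAST hits. The following is a minimal sketch, assuming the default 12-column tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The example line, its accession and the query length are made up for illustration, and the function names are hypothetical. Coverage here is computed from a single alignment, so it may differ from the query coverage shown on the NCBI results page, which combines all alignments to a subject.

```python
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    subject: str
    percent_identity: float
    evalue: float
    bitscore: float
    query_coverage: float  # percent of the query spanned by this one alignment


def parse_hit(line: str, query_length: int) -> Hit:
    """Parse one line of 12-column BLAST tabular output."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected 12 tab-separated columns")
    qstart, qend = int(fields[6]), int(fields[7])
    covered = abs(qend - qstart) + 1
    return Hit(
        subject=fields[1],
        percent_identity=float(fields[2]),
        evalue=float(fields[10]),
        bitscore=float(fields[11]),
        query_coverage=100.0 * covered / query_length,
    )


def top_hit(lines, query_length: int) -> Optional[Hit]:
    """Return the hit with the highest bit score, skipping blank and comment lines."""
    hits = [
        parse_hit(line, query_length)
        for line in lines
        if line.strip() and not line.startswith("#")
    ]
    return max(hits, key=lambda h: h.bitscore) if hits else None


# Made-up example line; the accession is a placeholder, not a real record.
example = "query1\tHYPOTHETICAL_ACC.1\t99.50\t400\t2\t0\t1\t400\t101\t500\t0.0\t728"
best = top_hit([example], query_length=500)
print(best.subject, best.percent_identity, best.evalue, f"{best.query_coverage:.1f}%")
```

The numbers reported in the write-up should still be checked against the BLAST results page itself.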
3-2) Using BLASTP to determine taxonomic distribution of a protein sequence.
Find the gene that was the most DOWN-regulated in your list of p38MAPK-dependent
Senescent/Young genes. Your goal with this part of the exercise is to determine if there is a
homolog of this protein in either a fungal or a bacterial species.
BCHM 6280 2016
Exercise 3
Page 1 of 3
Prepare a table that includes the following information:
• Gene ID
• HGNC symbol
• Log2 (Late/Senescent)
• Refseq protein accession (NP_XXXXXX)
• query length
• BLAST program used
• database searched (including limits)
• sequence ID and species of top match
• E-value & alignment length
• percent identity & percent similarity.
Questions that should be considered as part of the narrative:
1) Based on a pairwise comparison of the top match in each taxonomic group, would you
consider each top match to be a homolog? That is, does it appear to have the same
biochemical function as the human protein?
2) If the human protein has known protein domains, do the top matches in the other
taxonomic groups have the same domains?
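Percent identity and percent similarity both come from counting columns of the pairwise alignment. The sketch below illustrates the arithmetic on a made-up gapped alignment. The residue groups used to call two amino acids "similar" are a simplifying assumption for illustration only; BLAST counts positives with a scoring matrix, so its reported similarity will generally differ. Both percentages are taken over the full alignment length, gaps included.

```python
# Simplified groups of chemically similar amino acids. This grouping is an
# illustrative assumption, not the scoring matrix that BLAST uses.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ"]


def is_similar(a: str, b: str) -> bool:
    """True if both residues fall in the same simplified group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def alignment_stats(aligned_query: str, aligned_subject: str) -> dict:
    """Count identical, similar and gapped columns of a pairwise alignment."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    length = len(aligned_query)
    if length == 0:
        raise ValueError("alignment is empty")
    identical = similar = gaps = 0
    for a, b in zip(aligned_query.upper(), aligned_subject.upper()):
        if a == "-" or b == "-":
            gaps += 1
        elif a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif is_similar(a, b):
            similar += 1
    return {
        "alignment_length": length,
        "gaps": gaps,
        "percent_identity": 100.0 * identical / length,
        "percent_similarity": 100.0 * similar / length,
    }


# Made-up alignment: '-' marks a gap in the query.
stats = alignment_stats("MKV-LDE", "MRVALEE")
print(stats)
```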
3-3) Identifying primer locations using Primer-BLAST
A colleague sends you primers that he says will amplify a specific alternative transcript
of the human ACPP gene, which is highly UP-regulated in your list of genes. He says the
primers will specifically amplify transcript isoform 2. This particular transcript isoform
produces a transmembrane form of this protein versus the secreted form produced by the
predominant transcript isoform 1. He says the PCR product will be ~550 bp in size. From past
experience working with your colleague, you want to confirm that these primers:
a. Will bind only to the human ACPP transcript isoform 2 and give the product size
expected
b. Will not bind to the ACPP transcript isoform 1
c. Will not bind anywhere else in the human genome.
You should include in your narrative or report the Refseq mRNA accession numbers of the
reference transcripts for ACPP transcript isoforms 1 and 2 and their length.
Questions that should be considered as part of the narrative:
1. Describe what tools and websites you used to answer these questions.
2. Where in the human transcript do these primers bind?
3. What is the predicted size of the PCR product for the transcript?
4. Will they bind to any other human genes or transcripts?
5. If you used these same primers in a PCR reaction with genomic DNA, what is the
expected product size?
6. What do the relative sizes of the transcript and genomic products tell you about
where in the gene these primers bind?
7. Will these primers work with the mouse homolog of ACPP?
Forward primer: TATCCACATTCGCCGTGGAC
Reverse primer: CGGACAACTGTGGCAGAGAA
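To see how a product size follows from primer positions, recall that the reverse primer anneals to the opposite strand, so its reverse complement is what appears in the transcript sequence as written. The sketch below is a minimal illustration using exact string matching on a toy template built for the example (not the ACPP sequence); the function names are hypothetical. Primer-BLAST also evaluates mismatched and off-target binding, which this sketch does not attempt.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def predict_product(template: str, forward: str, reverse: str):
    """Locate exact primer sites on the template's plus strand.

    Returns (start, end, size) with 1-based inclusive coordinates,
    or None if either primer has no exact match.
    """
    template = template.upper()
    forward = forward.upper()
    fwd_start = template.find(forward)
    if fwd_start == -1:
        return None
    # The reverse primer's binding site reads as its reverse complement.
    rev_site = reverse_complement(reverse.upper())
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start == -1:
        return None
    end = rev_start + len(rev_site)
    return fwd_start + 1, end, end - fwd_start


FORWARD = "TATCCACATTCGCCGTGGAC"
REVERSE = "CGGACAACTGTGGCAGAGAA"

# Toy template: flanks and spacer are arbitrary filler, not real sequence.
toy_template = "A" * 30 + FORWARD + "G" * 100 + reverse_complement(REVERSE) + "T" * 30

result = predict_product(toy_template, FORWARD, REVERSE)
print(result)
```

Running the same search against a transcript and against the genomic sequence shows whether introns lie between the two primer sites, since any intron between them is present only in the genomic product.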
Extra credit (15 points):
Design a single pair of primers to distinguish transcript isoforms
You can use the MAPK14 gene or any other gene from your list that contains 2 or more
transcript isoforms that result from alternative splicing. Your goal is to design a pair of PCR
primers that will distinguish the transcript isoforms in a PCR assay.
For the report provide:
a. The Ensembl transcript accession numbers of the two transcript isoforms
b. Size of the transcripts of the two isoforms
c. Description of how the two transcript isoforms differ
d. Sequences of the primer pair, where they bind and expected product size for the
two isoforms
Questions to answer as part of a narrative:
1. Describe what web-based tools and databases you used to design these PCR primers.
2. Are the product sizes expected for the primers used with the two transcript isoforms
sufficiently different that you are confident that you can distinguish them on a gel?
3. What product size would the two primers give if you used them with genomic DNA?
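The product sizes asked about here reduce to adding up the template segments that lie between the two primer sites. The sketch below works through that arithmetic for primers placed in the exons flanking an alternatively spliced exon. Every length is a hypothetical placeholder chosen for illustration, not a value from MAPK14 or any real gene, and the exon layout is only one possible design.

```python
def product_size(segments) -> int:
    """Sum the lengths (bp) of the segments spanned by the primer pair."""
    return sum(length for _, length in segments)


# Hypothetical segment lengths in bp.
upstream = ("exon 2, forward primer to exon end", 80)
cassette = ("exon 3, alternatively spliced", 120)
downstream = ("exon 4, exon start to reverse primer", 100)
intron_2 = ("intron 2", 900)
intron_3 = ("intron 3", 1500)

# Isoform that includes the alternative exon.
with_exon = product_size([upstream, cassette, downstream])
# Isoform that skips it.
without_exon = product_size([upstream, downstream])
# Genomic DNA retains the introns between the primer sites.
genomic = product_size([upstream, intron_2, cassette, intron_3, downstream])

print("isoform including exon:", with_exon, "bp")
print("isoform skipping exon:", without_exon, "bp")
print("difference between isoforms:", with_exon - without_exon, "bp")
print("genomic product:", genomic, "bp")
```

Whether a given size difference resolves on a gel depends on the gel percentage and run conditions, so that judgment belongs in the narrative.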