2008 Spring Biological database Homework 1
This problem set is due by 2 PM, March 25, 2008. Upload your answers to your
web site as instructed by your TA. For every question, include a reference, such as a
screenshot, to indicate the source of your answer.
1. Here is a nucleotide sequence:
CTCCAGGCCCGTGGGGCTGGCCCTGCACCGCCGAGCTTCCCGGGATGAGGGCCCCCGGTGTGGTCACCCG
GCGCGCCCCAGGTCGCTGAGGGACCCCGGCCAGGCGCGGAGATGGGGGTGCACGAATGTCCTGCCTGGCT
GTGGCTTCTCCTGTCCCTGCTGTCGCTCCCTCTGGGCCTCCCAGTCCTGGGCGCCCCACCACGCCTCATC
TGTGACAGCCGAGTCCTGGAGAGGTACCTCTTGGAGGCCAAGGAGGCCGAGAATATCACGACGGGCTGTG
CTGAACACTGCAGCTTGAATGAGAATATCACTGTCCCAGACACCAAAGTTAATTTCTATGCCTGGAAGAG
GATGGAGGTCGGGCAGCAGGCCGTAGAAGTCTGGCAGGGCCTGGCCCTGCTGTCGGAAGCTGTCCTGCGG
GGCCAGGCCCTGTTGGTCAACTCTTCCCAGCCGTGGGAGCCCCTGCAGCTGCATGTGGATAAAGCCGTCA
GTGGCCTTCGCAGCCTCACCACTCTGCTTCGGGCTCTGGGAGCCCAGAAGGAAGCCATCTCCCCTCCAGA
TGCGGCCTCAGCTGCTCCACTCCGAACAATCACTGCTGACACTTTCCGCAAACTCTTCCGAGTCTACTCC
AATTTCCTCCGGGGAAAGCTGAAGCTGTACACAGGGGAGGCCTGCAGGACAGGGGACAGATGACCAGGTG
TGTCCACCTGGGCATATCCACCACCTCCCTCACCAACATTGCTTGTGCCACACCCTCCCCCGCCACTCCT
GAACCCCGTCGAGGGGCTCTCAGCTCAGCGCCAGCCTGTCCCATGGACACTCCAGTGCCAGCAATGACAT
CTCAGGGGCCAGAGGAACTGTCCAGAGAGCAACTCTGAGATCTAAGGATGTCACAGGGCCAACTTGAGGG
CCCAGAGCAGGAAGCATTCAGAGAGCAGCTTTAAACTCAGGGACAGAGCCATGCTGGGAAGACGCCTGAG
CTCACTCGGCACCCTGCAAAATTTGATGCCAGGACACGCTTTGGAGGCGATTTACCTGTTTTCGCACCTA
CCATCAGGGACAGGATGACCTGGAGAACTTAGGTGGCAAGCTGTGACTTCTCCAGGTCTCACGGGCATGG
Please use database mining tools of your choice to tell me as much as you can
about this sequence.
i.
What gene does this sequence represent in human? What is its GI number?
GenBank Accession number? Gene symbol? Unigene ID?
This sequence represents the human erythropoietin gene.
GenBank Accession number: NM_000799
This website shows the following information:
Gene symbol: EPO
Unigene ID: Hs.2303
ii.
What database(s) did you search, and what tool(s) did you use in your search?
What parameter settings did you use?
I searched the NCBI database, using NCBI BLAST as the search tool.
I also used the keyword “epo” to search the NCBI database.
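A keyword search like the one described above can be made reproducible by writing down the exact query. The sketch below only builds a request URL for the NCBI E-utilities ESearch endpoint and does not send it. The fielded term (`EPO[Gene Name] AND Homo sapiens[Organism]`), the choice of the `gene` database, and the helper name `build_esearch_url` are assumptions for illustration, not the settings used in the original answer.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="gene", retmax=5):
    """Return an ESearch URL for `term`; nothing is sent over the network."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Plain keyword, as in the answer above.
keyword_url = build_esearch_url("epo")

# Assumed, more specific variant restricted to the human gene record.
fielded_url = build_esearch_url("EPO[Gene Name] AND Homo sapiens[Organism]")

print(keyword_url)
print(fielded_url)
```

Recording the URL (or the equivalent search-box text) next to each screenshot makes it easier for a grader to repeat the search.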
iii.
Retrieve one ortholog of this gene’s complete mRNA sequence and Protein
sequence in FASTA format. Compare the results obtained by blastn vs.
blastp.
>gi|54792749|ref|NM_001006646.1| Canis lupus familiaris erythropoietin (EPO), mRNA
ATGTGTGAACCTGCCCCTCCAAAACCCACACAGTCAGCCTGGCACTCTTTTCCAGAATGTCCTGCCCTGC
TCCTTTTGCTGTCTTTGCTGCTGCTTCCTCTGGGCCTCCCAGTCCTGGGCGCCCCCCCTCGCCTCATTTG
TGACAGCCGGGTCCTGGAGAGATACATCCTGGAGGCCAGGGAGGCCGAAAATGTCACGATGGGCTGTGCT
CAAGGCTGCAGCTTCAGTGAGAATATCACCGTCCCAGACACCAAGGTTAATTTCTATACCTGGAAGAGGA
TGGATGTTGGGCAGCAGGCCTTGGAAGTCTGGCAGGGCCTGGCACTGCTCTCAGAAGCCATCCTGCGGGG
TCAGGCCCTGTTGGCCAACGCCTCCCAGCCATCTGAGACTCCGCAGCTGCATGTGGACAAAGCCGTCAGC
AGCCTGCGCAGCCTCACCTCTCTGCTTCGGGCGCTGGGAGCCCAGAAGGAGGCCATGTCCCTTCCAGAGG
AAGCCTCTCCTGCTCCACTCCGAACATTCACTGTTGATACTTTGTGCAAACTTTTCCGAATCTACTCCAA
TTTCCTCCGTGGAAAGCTGACACTGTACACAGGGGAGGCCTGCAGAAGAGGAGACAGGTGACCAGGTGCT
CCCACCCCAGGCACATCCACCACCTCACTCACTACCACTGCCTGGGCCACGCCTCTGCACCACCACTCCT
GACCCCTGTCCAGGGGTGATCTGCTCAGCACCAGCCTGTCCCTGTCCCTTGGACACTCCACGGCCAGTGG
TGATATCTCAAGGGCCAGAGGAACTGTCCAGAGCTCAAATCAGATCTAAGGATGTCACAGTGCCAGCCTG
AGGCCCGAAGCAGGAGGAATTCGGAGGAAATCAGCTCAAACTTGGGGACAGAGCCTTGCTCGGGAGACTC
ACCTCGGTGCCCTGCCGAACAGTGATGCCAGGACAAGCTGGAGGGCAATTGCCGATTTTTTGCACCTATC
AGGGAGAGACAGGAGAGGCTAGAGAACTAGGTGGCAAGCCATAAATCTTTTAGGCTTCGGGTCTCCTATG
ACAGCAAGAGCCCACTGGCAAAGGGGGGGGAGCCATGGAGATGGGATAGGGGCTGGCCCAAAAAAAAAAA
AA
>gi|54792750|ref|NP_001006647.1| erythropoietin [Canis lupus familiaris]
MCEPAPPKPTQSAWHSFPECPALLLLLSLLLLPLGLPVLGAPPRLICDSRVLERYILEAREAENVTMGCA
QGCSFSENITVPDTKVNFYTWKRMDVGQQALEVWQGLALLSEAILRGQALLANASQPSETPQLHVDKAVS
SLRSLTSLLRALGAQKEAMSLPEEASPAPLRTFTVDTLCKLFRIYSNFLRGKLTLYTGEACRRGDR
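The FASTA records above carry the GI number and accession in their header lines. As a minimal sketch, assuming the older NCBI header layout `gi|<GI>|<db>|<accession>| description` seen in these records, the following parses the dog protein record and splits its header. The function names `parse_fasta` and `split_ncbi_header` are hypothetical helpers, not part of any library.

```python
DOG_EPO_PROTEIN_FASTA = """\
>gi|54792750|ref|NP_001006647.1| erythropoietin [Canis lupus familiaris]
MCEPAPPKPTQSAWHSFPECPALLLLLSLLLLPLGLPVLGAPPRLICDSRVLERYILEAREAENVTMGCA
QGCSFSENITVPDTKVNFYTWKRMDVGQQALEVWQGLALLSEAILRGQALLANASQPSETPQLHVDKAVS
SLRSLTSLLRALGAQKEAMSLPEEASPAPLRTFTVDTLCKLFRIYSNFLRGKLTLYTGEACRRGDR
"""


def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_ncbi_header(header):
    """Split a 'gi|GI|db|accession| description' header into named fields."""
    tag, gi, db, accession, description = header.split("|", 4)
    if tag != "gi":
        raise ValueError("header does not start with 'gi|'")
    return {"gi": gi, "db": db, "accession": accession,
            "description": description.strip()}


records = parse_fasta(DOG_EPO_PROTEIN_FASTA)
header, sequence = records[0]
fields = split_ncbi_header(header)
print(fields["gi"], fields["accession"], len(sequence))
```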
Comparison using blastn:
Comparison using blastp:
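To illustrate why blastn and blastp can report different identities for the same gene pair, the sketch below compares one short window at both levels. The two 69-nucleotide windows are copied from the human query and the dog mRNA shown above. That they line up without gaps and start in the coding frame is an assumption from manual inspection, not a BLAST result. The translation uses the standard genetic code, and the names `translate` and `identity` are hypothetical helpers.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed in-frame, ungapped windows taken from the sequences in this document.
HUMAN_NT = "TGTGACAGCCGAGTCCTGGAGAGGTACCTCTTGGAGGCCAAGGAGGCCGAGAATATCACGACGGGCTGT"
DOG_NT = "TGTGACAGCCGGGTCCTGGAGAGATACATCCTGGAGGCCAGGGAGGCCGAAAATGTCACGATGGGCTGT"


def translate(nt):
    """Translate a nucleotide string codon by codon, ignoring a trailing partial codon."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def identity(a, b):
    """Fraction of identical positions between two equal-length, pre-aligned strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


human_aa = translate(HUMAN_NT)
dog_aa = translate(DOG_NT)

nt_identity = identity(HUMAN_NT, DOG_NT)
aa_identity = identity(human_aa, dog_aa)

print(human_aa)
print(dog_aa)
print(f"nucleotide identity: {nt_identity:.3f}")
print(f"protein identity:    {aa_identity:.3f}")
```

Some nucleotide differences in this window are synonymous and leave the amino acid unchanged, while others change it, so the two percentages need not agree. A real blastn or blastp run also handles gaps and scores substitutions, which this position-by-position count does not.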
iv.
Retrieve at least 5 homologs of this gene. Perform a multiple sequence
alignment. The human sequence is most similar to that of which organism?
Result of multiple sequence alignment:
The human sequence is most similar to that of Pan troglodytes.
v.
Is the secondary structure of this protein known? If so, how many
“helical folds” are there in its 3D protein structure? How did you determine
the exact amino acid number of each helical region?
Yes.
There are 4 helical folds in the 3D protein structure.
The sequence details provided by the PDB website show the sequences of the helical
structures.
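The residue range of each helix can also be read from the HELIX records of a PDB-format file, which store the start and end residue numbers in fixed columns. The sketch below parses such records. The three sample lines are hypothetical placeholders written in the HELIX column layout. They are not taken from any erythropoietin structure, and `parse_helix_record` is an illustrative helper name.

```python
# Hypothetical HELIX records (NOT real EPO data), truncated after the helix class.
SAMPLE_HELIX_LINES = [
    "HELIX    1   1 ALA A    8  LEU A   26  1",
    "HELIX    2   2 SER A   55  GLY A   83  1",
    "HELIX    3   3 ARG A   90  LEU A  112  1",
]


def parse_helix_record(line):
    """Parse one PDB HELIX record using its fixed column positions."""
    if not line.startswith("HELIX"):
        raise ValueError("not a HELIX record")
    start = int(line[21:25])   # columns 22-25: initial residue number
    end = int(line[33:37])     # columns 34-37: terminal residue number
    return {
        "helix_id": line[11:14].strip(),  # columns 12-14
        "chain": line[19],                # column 20
        "start_residue": line[15:18],     # columns 16-18
        "start": start,
        "end_residue": line[27:30],       # columns 28-30
        "end": end,
        # Assumes contiguous numbering with no insertion codes.
        "length": end - start + 1,
    }


helices = [parse_helix_record(line) for line in SAMPLE_HELIX_LINES]
for h in helices:
    print(f"helix {h['helix_id']}: chain {h['chain']} "
          f"{h['start_residue']}{h['start']}-{h['end_residue']}{h['end']} "
          f"({h['length']} residues)")
```

Applied to the HELIX lines of the actual structure file, the same parsing would list the helices and their residue ranges. The count of four helices reported above comes from the PDB website, not from this sketch.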
vi.
Is the function of this protein known? If so, what does it do?
The functions of the protein are listed below:
vii.
Which normal human tissues is this gene mainly expressed in? How did you
determine this?
This gene is mainly expressed in eye and prostate.
viii.
Is this protein involved in any biological pathway(s)? If so, what does the
pathway do?
EPO is involved in three pathways, as described on the website:
(1) Cytokine-cytokine receptor interaction
(2) Hematopoietic cell lineage
(3) Jak-STAT signaling pathway
ix.
Do any other databases contain information about the superfamily of this
target gene product? Which superfamily? How did you find out?
x.
Look for publications relevant to the function(s) of this protein in the
biomedical literature. Show one abstract of a relevant article.
Below is the abstract of the relevant article:
xi.
Show the protein 3-D structure if there is any.
2. Find the zebrafish homolog of the above gene and answer the following
questions:
i.
On which chromosome is the zebrafish homolog located? And in human?
In zebrafish, the gene is located on chromosome 7.
In human, the gene is also located on chromosome 7.
ii.
Perform a cDNA and a polypeptide sequence alignment of this gene between human and
zebrafish.
Blastn:
Blastp:
iii.
How many exons does this gene have in zebrafish? How did you determine
this?
There are 5 exons in the gene sequence, as described on the website.
iv.
What is the expression pattern of this gene in zebrafish? In human? In mouse?