2008 Spring Biological database Homework 1
This problem set is due by 2 PM, March 25, 2008. Upload your answers to your
web site as instructed by your TA. For every question, include a reference, such as a
screenshot, to indicate the source of your answer.
1. Here is a nucleotide sequence:
CTCCAGGCCCGTGGGGCTGGCCCTGCACCGCCGAGCTTCCCGGGATGAGGGCCCCCGGTGTGGTCACCCG
GCGCGCCCCAGGTCGCTGAGGGACCCCGGCCAGGCGCGGAGATGGGGGTGCACGAATGTCCTGCCTGGCT
GTGGCTTCTCCTGTCCCTGCTGTCGCTCCCTCTGGGCCTCCCAGTCCTGGGCGCCCCACCACGCCTCATC
TGTGACAGCCGAGTCCTGGAGAGGTACCTCTTGGAGGCCAAGGAGGCCGAGAATATCACGACGGGCTGTG
CTGAACACTGCAGCTTGAATGAGAATATCACTGTCCCAGACACCAAAGTTAATTTCTATGCCTGGAAGAG
GATGGAGGTCGGGCAGCAGGCCGTAGAAGTCTGGCAGGGCCTGGCCCTGCTGTCGGAAGCTGTCCTGCGG
GGCCAGGCCCTGTTGGTCAACTCTTCCCAGCCGTGGGAGCCCCTGCAGCTGCATGTGGATAAAGCCGTCA
GTGGCCTTCGCAGCCTCACCACTCTGCTTCGGGCTCTGGGAGCCCAGAAGGAAGCCATCTCCCCTCCAGA
TGCGGCCTCAGCTGCTCCACTCCGAACAATCACTGCTGACACTTTCCGCAAACTCTTCCGAGTCTACTCC
AATTTCCTCCGGGGAAAGCTGAAGCTGTACACAGGGGAGGCCTGCAGGACAGGGGACAGATGACCAGGTG
TGTCCACCTGGGCATATCCACCACCTCCCTCACCAACATTGCTTGTGCCACACCCTCCCCCGCCACTCCT
GAACCCCGTCGAGGGGCTCTCAGCTCAGCGCCAGCCTGTCCCATGGACACTCCAGTGCCAGCAATGACAT
CTCAGGGGCCAGAGGAACTGTCCAGAGAGCAACTCTGAGATCTAAGGATGTCACAGGGCCAACTTGAGGG
CCCAGAGCAGGAAGCATTCAGAGAGCAGCTTTAAACTCAGGGACAGAGCCATGCTGGGAAGACGCCTGAG
CTCACTCGGCACCCTGCAAAATTTGATGCCAGGACACGCTTTGGAGGCGATTTACCTGTTTTCGCACCTA
CCATCAGGGACAGGATGACCTGGAGAACTTAGGTGGCAAGCTGTGACTTCTCCAGGTCTCACGGGCATGG
Please use database mining tools of your choice to tell me as much as you can
about this sequence.
What gene does this sequence represent in human? (Erythropoietin) What is its GI number?
(GI: 62240996 for NM_000799.2; GeneID: 2056) GenBank accession number? (NM_000799)
Gene symbol? (EPO) UniGene ID? (UGID: 131206, UniGene Hs.2303)
From this website, we know that it is the erythropoietin gene.
GeneID: 2056
Gene symbol: EPO
Accession: NM_000799
UGID: 131206
UniGene: Hs.2303
i.
What database(s) did you search, and what tool(s) did you use in your search?
What parameter settings did you use?
BLAST, UniGene, GenBank, CoreNucleotide, Google
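As a rough sketch of how the same record could be retrieved without the web interface, the snippet below builds an NCBI E-utilities `efetch` request URL for the accession identified above. The helper name `efetch_fasta_url` is made up for this illustration, and the choice of the `nuccore` database is an assumption; the snippet only constructs the URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities efetch service.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accession, db="nuccore"):
    """Build a URL that asks efetch for one record as plain-text FASTA.

    `efetch_fasta_url` is a hypothetical helper written for this example.
    """
    params = {
        "db": db,            # database to query (assumed: nuccore for mRNA)
        "id": accession,     # accession number taken from the answer above
        "rettype": "fasta",  # return the record in FASTA format
        "retmode": "text",   # as plain text rather than XML
    }
    return EFETCH + "?" + urlencode(params)


# URL for the EPO mRNA record; passing db="protein" with NP_000790
# would request the protein record instead.
mrna_url = efetch_fasta_url("NM_000799")
print(mrna_url)
```

Opening the printed URL in a browser, or with any HTTP client, should return the FASTA text, provided the service is reachable.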
ii.
Retrieve one ortholog of this gene’s complete mRNA sequence and Protein
sequence in FASTA format. Compare the results obtained by blastn vs.
blastp.
mRNA sequence in FASTA format:
>gi|62240996|ref|NM_000799.2| Homo sapiens erythropoietin (EPO), mRNA
CCCGGAGCCGGACCGGGGCCACCGCGCCCGCTCTGCTCCGACACCGCGCCCCCTGGACAGCCGCCCTCTC
CTCCAGGCCCGTGGGGCTGGCCCTGCACCGCCGAGCTTCCCGGGATGAGGGCCCCCGGTGTGGTCACCCG
GCGCGCCCCAGGTCGCTGAGGGACCCCGGCCAGGCGCGGAGATGGGGGTGCACGAATGTCCTGCCTGGCT
GTGGCTTCTCCTGTCCCTGCTGTCGCTCCCTCTGGGCCTCCCAGTCCTGGGCGCCCCACCACGCCTCATC
TGTGACAGCCGAGTCCTGGAGAGGTACCTCTTGGAGGCCAAGGAGGCCGAGAATATCACGACGGGCTGTG
CTGAACACTGCAGCTTGAATGAGAATATCACTGTCCCAGACACCAAAGTTAATTTCTATGCCTGGAAGAG
GATGGAGGTCGGGCAGCAGGCCGTAGAAGTCTGGCAGGGCCTGGCCCTGCTGTCGGAAGCTGTCCTGCGG
GGCCAGGCCCTGTTGGTCAACTCTTCCCAGCCGTGGGAGCCCCTGCAGCTGCATGTGGATAAAGCCGTCA
GTGGCCTTCGCAGCCTCACCACTCTGCTTCGGGCTCTGGGAGCCCAGAAGGAAGCCATCTCCCCTCCAGA
TGCGGCCTCAGCTGCTCCACTCCGAACAATCACTGCTGACACTTTCCGCAAACTCTTCCGAGTCTACTCC
AATTTCCTCCGGGGAAAGCTGAAGCTGTACACAGGGGAGGCCTGCAGGACAGGGGACAGATGACCAGGTG
TGTCCACCTGGGCATATCCACCACCTCCCTCACCAACATTGCTTGTGCCACACCCTCCCCCGCCACTCCT
GAACCCCGTCGAGGGGCTCTCAGCTCAGCGCCAGCCTGTCCCATGGACACTCCAGTGCCAGCAATGACAT
CTCAGGGGCCAGAGGAACTGTCCAGAGAGCAACTCTGAGATCTAAGGATGTCACAGGGCCAACTTGAGGG
CCCAGAGCAGGAAGCATTCAGAGAGCAGCTTTAAACTCAGGGACAGAGCCATGCTGGGAAGACGCCTGAG
CTCACTCGGCACCCTGCAAAATTTGATGCCAGGACACGCTTTGGAGGCGATTTACCTGTTTTCGCACCTA
CCATCAGGGACAGGATGACCTGGAGAACTTAGGTGGCAAGCTGTGACTTCTCCAGGTCTCACGGGCATGG
GCACTCCCTTGGTGGCAAGAGCCCCCTTGACACCGGGGTGGTGGGAACCATGAAGACAGGATGGGGGCTG
GCCTCTGGCTCTCATGGGGTCCAAGTTTTGTGTATTCTTCAACCTCATTGACAAGAACTGAAACCACCAA
AAAAAAAAAA
Protein sequence in FASTA format:
>gi|62240997|ref|NP_000790.2| erythropoietin precursor [Homo sapiens]
MGVHECPAWLWLLLSLLSLPLGLPVLGAPPRLICDSRVLERYLLEAKEAENITTGCAEHCSLNENITVPD
TKVNFYAWKRMEVGQQAVEVWQGLALLSEAVLRGQALLVNSSQPWEPLQLHVDKAVSGLRSLTTLLRALG
AQKEAISPPDAASAAPLRTITADTFRKLFRVYSNFLRGKLKLYTGEACRTGDR
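blastn: nucleotide BLAST (a nucleotide query searched against a nucleotide database)
blastp: protein BLAST (a protein query searched against a protein database)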
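The link between the two FASTA records above can be checked by hand: translating the mRNA from its start codon should reproduce the beginning of the protein. The sketch below is a minimal illustration, assuming the standard genetic code and a short excerpt copied from the mRNA record (the region around the ATG that begins the coding sequence). The header text and the function names are invented for this example. The full mRNA contains an earlier ATG upstream, so in practice the start offset should come from the record's CDS annotation rather than from a search for the first ATG.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("not a FASTA record")
    return lines[0][1:], "".join(lines[1:]).upper()


def translate(seq, start):
    """Translate seq from offset `start` until a stop codon or the end."""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = unknown codon
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# Excerpt of the mRNA above; the header line is hypothetical.
record = """>excerpt of NM_000799.2 (illustrative header)
GGAGATGGGGGTGCACGAATGTCCTGCCTGGCT
GTGGCTTCTCCTGTCC
"""

header, seq = parse_fasta(record)
start = seq.find("ATG")  # valid for this excerpt only, see the note above
peptide = translate(seq, start)
print(peptide)  # MGVHECPAWLWLLLS
```

The printed peptide matches the first 15 residues of the NP_000790.2 record, which is why a blastn search with the mRNA and a blastp search with the protein point to the same gene.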
iii.
Retrieve at least 5 homologenes of this gene. Perform a multiple sequence
alignment. The human sequence is most similar to what organism?
The human sequence is most similar to the chimpanzee sequence, because their similarity is
99.48 (n) and 99.48 (a).
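To make the similarity figures less of a black box, here is a small sketch of one common way to compute percent identity from two already-aligned sequences: count the columns where both residues match and divide by the number of compared columns. Tools differ in how they treat gaps and which length they divide by, so this is not necessarily the formula behind the values quoted above. The function name and the toy sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns, skipping gap-gap columns.

    Columns where only one sequence has a gap count as mismatches here;
    other tools may exclude them instead (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap and y == gap:
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example: the second sequence is an invented one-residue variant.
identical = percent_identity("MGVHECPAWL", "MGVHECPAWL")
one_change = percent_identity("MGVHECPAWL", "MGVQECPAWL")
print(identical, one_change)
```

Running the function on each pairwise alignment from the multiple sequence alignment would give a ranking of organisms by similarity to the human sequence.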
iv.
Is the secondary structure of this protein known? If so, how many “helical
folds” are there in its 3D protein structure? (4 helical folds) How did you
determine the exact amino acid number of each helical region? (18, 28, 20, and 24 amino acids)
v.
Is the function of this protein known? If so, what does it do?
EPO is used in treating anemia resulting from chronic kidney disease, from cancer
treatment (chemotherapy and radiation), and from other critical illnesses (heart failure).
Erythropoietin is available as a therapeutic agent produced by recombinant DNA
technology in mammalian cell culture.
vi.
Which normal human tissues is this gene mainly expressed in? How did you
determine this?
From the expression profile below, this gene is mainly expressed in eye and prostate.
vii.
Is this protein involved in any biological pathway(s)? If so, what does the
pathway do?
Erythropoiesis is the process by which red blood cells (erythrocytes) are produced. In
human adults, this usually occurs within the bone marrow. In humans with certain
diseases and in some animals, erythropoiesis also occurs outside the bone marrow,
within the spleen or liver; this is termed extramedullary erythropoiesis.
viii.
Do any other databases contain information about the superfamily of this
target gene product? Which superfamily? How did you find out?
The GeneCards database contains information about the superfamily of this gene.
ix.
Look for publications relevant to the function(s) of this protein in the
biomedical literature. Show one abstract of a relevant article.
TOPIC: Stat5 activation enables erythropoiesis in the absence of EpoR and Jak2.
x.
Show the protein 3-D structure if there is any.
2. Find the zebrafish homolog of the above gene and answer the following
questions:
i.
The zebrafish homolog is located on which chromosome? And in human?
Human chromosome: 7
Location: 7q22
Zebrafish chromosome: 1
Perform a cDNA and polypeptide sequence alignment between human and zebrafish
for this gene.
cDNA sequence alignment
Polypeptide sequence alignment
ii.
How many exons does this gene have in zebrafish? How did you determine
this?
Exons: 5
Transcript length: 1,825 bps
Translation length: 182 residues
iii.
What is the expression pattern of this gene in zebrafish? In human? In mouse?
Zebrafish
Human
Mouse