2008 Spring Biological database Homework 1

This problem set is due by 2PM, March 25, 2008. You shall upload your answers to your web site as instructed by your TA. For all questions, please make a reference such as screen-shot to indicate the source of your answer.

1. Here is a nucleotide sequence:

CTCCAGGCCCGTGGGGCTGGCCCTGCACCGCCGAGCTTCCCGGGATGAGGGCCCCCGGTGTGGTCACCCG
GCGCGCCCCAGGTCGCTGAGGGACCCCGGCCAGGCGCGGAGATGGGGGTGCACGAATGTCCTGCCTGGCT
GTGGCTTCTCCTGTCCCTGCTGTCGCTCCCTCTGGGCCTCCCAGTCCTGGGCGCCCCACCACGCCTCATC
TGTGACAGCCGAGTCCTGGAGAGGTACCTCTTGGAGGCCAAGGAGGCCGAGAATATCACGACGGGCTGTG
CTGAACACTGCAGCTTGAATGAGAATATCACTGTCCCAGACACCAAAGTTAATTTCTATGCCTGGAAGAG
GATGGAGGTCGGGCAGCAGGCCGTAGAAGTCTGGCAGGGCCTGGCCCTGCTGTCGGAAGCTGTCCTGCGG
GGCCAGGCCCTGTTGGTCAACTCTTCCCAGCCGTGGGAGCCCCTGCAGCTGCATGTGGATAAAGCCGTCA
GTGGCCTTCGCAGCCTCACCACTCTGCTTCGGGCTCTGGGAGCCCAGAAGGAAGCCATCTCCCCTCCAGA
TGCGGCCTCAGCTGCTCCACTCCGAACAATCACTGCTGACACTTTCCGCAAACTCTTCCGAGTCTACTCC
AATTTCCTCCGGGGAAAGCTGAAGCTGTACACAGGGGAGGCCTGCAGGACAGGGGACAGATGACCAGGTG
TGTCCACCTGGGCATATCCACCACCTCCCTCACCAACATTGCTTGTGCCACACCCTCCCCCGCCACTCCT
GAACCCCGTCGAGGGGCTCTCAGCTCAGCGCCAGCCTGTCCCATGGACACTCCAGTGCCAGCAATGACAT
CTCAGGGGCCAGAGGAACTGTCCAGAGAGCAACTCTGAGATCTAAGGATGTCACAGGGCCAACTTGAGGG
CCCAGAGCAGGAAGCATTCAGAGAGCAGCTTTAAACTCAGGGACAGAGCCATGCTGGGAAGACGCCTGAG
CTCACTCGGCACCCTGCAAAATTTGATGCCAGGACACGCTTTGGAGGCGATTTACCTGTTTTCGCACCTA
CCATCAGGGACAGGATGACCTGGAGAACTTAGGTGGCAAGCTGTGACTTCTCCAGGTCTCACGGGCATGG

Please use database mining tools of your choice to tell me as much as you can about this sequence.

i. What gene does this sequence represent in human? What is its GI number? GenBank Accession number? Gene symbol? Unigene ID?

This sequence represents the erythropoietin gene in human.
GenBank Accession number: NM_000799
From this website, the following information is shown:
Gene symbol: EPO
Unigene ID: Hs.2303

ii. What database(s) did you search, and what tool(s) did you use in your search? What parameter settings did you use?

I searched the NCBI database.
I used NCBI BLAST as the search tool, and I used the keyword “epo” to search the NCBI database.

iii. Retrieve one ortholog of this gene’s complete mRNA sequence and Protein sequence in FASTA format. Compare the results obtained by blastn vs. blastp.

>gi|54792749|ref|NM_001006646.1| Canis lupus familiaris erythropoietin (EPO), mRNA
ATGTGTGAACCTGCCCCTCCAAAACCCACACAGTCAGCCTGGCACTCTTTTCCAGAATGTCCTGCCCTGC
TCCTTTTGCTGTCTTTGCTGCTGCTTCCTCTGGGCCTCCCAGTCCTGGGCGCCCCCCCTCGCCTCATTTG
TGACAGCCGGGTCCTGGAGAGATACATCCTGGAGGCCAGGGAGGCCGAAAATGTCACGATGGGCTGTGCT
CAAGGCTGCAGCTTCAGTGAGAATATCACCGTCCCAGACACCAAGGTTAATTTCTATACCTGGAAGAGGA
TGGATGTTGGGCAGCAGGCCTTGGAAGTCTGGCAGGGCCTGGCACTGCTCTCAGAAGCCATCCTGCGGGG
TCAGGCCCTGTTGGCCAACGCCTCCCAGCCATCTGAGACTCCGCAGCTGCATGTGGACAAAGCCGTCAGC
AGCCTGCGCAGCCTCACCTCTCTGCTTCGGGCGCTGGGAGCCCAGAAGGAGGCCATGTCCCTTCCAGAGG
AAGCCTCTCCTGCTCCACTCCGAACATTCACTGTTGATACTTTGTGCAAACTTTTCCGAATCTACTCCAA
TTTCCTCCGTGGAAAGCTGACACTGTACACAGGGGAGGCCTGCAGAAGAGGAGACAGGTGACCAGGTGCT
CCCACCCCAGGCACATCCACCACCTCACTCACTACCACTGCCTGGGCCACGCCTCTGCACCACCACTCCT
GACCCCTGTCCAGGGGTGATCTGCTCAGCACCAGCCTGTCCCTGTCCCTTGGACACTCCACGGCCAGTGG
TGATATCTCAAGGGCCAGAGGAACTGTCCAGAGCTCAAATCAGATCTAAGGATGTCACAGTGCCAGCCTG
AGGCCCGAAGCAGGAGGAATTCGGAGGAAATCAGCTCAAACTTGGGGACAGAGCCTTGCTCGGGAGACTC
ACCTCGGTGCCCTGCCGAACAGTGATGCCAGGACAAGCTGGAGGGCAATTGCCGATTTTTTGCACCTATC
AGGGAGAGACAGGAGAGGCTAGAGAACTAGGTGGCAAGCCATAAATCTTTTAGGCTTCGGGTCTCCTATG
ACAGCAAGAGCCCACTGGCAAAGGGGGGGGAGCCATGGAGATGGGATAGGGGCTGGCCCAAAAAAAAAAA
AA

>gi|54792750|ref|NP_001006647.1| erythropoietin [Canis lupus familiaris]
MCEPAPPKPTQSAWHSFPECPALLLLLSLLLLPLGLPVLGAPPRLICDSRVLERYILEAREAENVTMGCA
QGCSFSENITVPDTKVNFYTWKRMDVGQQALEVWQGLALLSEAILRGQALLANASQPSETPQLHVDKAVS
SLRSLTSLLRALGAQKEAMSLPEEASPAPLRTFTVDTLCKLFRIYSNFLRGKLTLYTGEACRRGDR

Comparison using blastn:

Comparison using blastp:

iv. Retrieve at least 5 homologenes of this gene. Perform a multiple sequence alignment? The human sequence is most similar to what organism?
Result of multiple sequence alignment:

The human sequence is most similar to that of Pan troglodytes.

v. Is the secondary structure of this protein known? If so, how many “helical fold” are there in its 3D protein structure? How did you determine the exact amino acid number of each helical region?

Yes. There are 4 helical folds in the 3D protein structure. The sequence details provided by the PDB website show the sequences of the helical regions.

vi. Is the function of this protein known? If so, what does it do?

The functions of the protein are listed below:

vii. Which normal human tissues is this gene mainly expressed in? How did you determine this?

This gene is mainly expressed in eye and prostate.

viii. Is this protein involved in any biological pathway(s)? If so, what does the pathway do?

EPO is involved in three pathways as described in the website:
(1) Cytokine-cytokine receptor interaction
(2) Hematopoietic cell lineage
(3) Jak-STAT signaling pathway

ix. Do any other databases contain information about the superfamily of this target gene product? Which superfamily? How did you find out?

x. Look for publications relevant to the function(s) of this protein in the biomedical literature. Show one abstract of a relevant article.

Below is the abstract of the relevant article:

xi. Show the protein 3-D structure if there is any.

2. Find the zebra fish homolog of the above gene. And answer the following questions:

i. The zebra fish homolog is located on which chromosome? And in Human?

The gene is located on chromosome 7 in zebra fish. In human, the gene is also located on chromosome 7.

ii. Perform a cDNA and Polypeptide sequence alignment between human and zebra fish of this gene.

Blastn:

Blastp:

iii. How many exons does this gene have in zebrafish? How did you determine this?

There are 5 exons within the gene sequence as described in the website.

iv. What is the expression pattern of this gene in zebrafish? In human? In mouse?
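
Note on the sequence comparisons: the blastn and blastp comparisons requested above work from FASTA records like the Canis lupus familiaris entries retrieved for the ortholog question, and they are usually summarized as a percent identity. The following is a minimal Python sketch of those two steps, not the method used for the answers above. The function names (`parse_fasta`, `parse_ncbi_header`, `percent_identity`) are hypothetical helpers written for this illustration. It assumes the older `gi|...|ref|...|` header layout shown in this document and an alignment that has already been produced by another tool, with `-` marking gaps.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are wrapped; join them back into one string.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def parse_ncbi_header(header):
    """Split a header of the assumed form 'gi|<GI>|ref|<accession>| description'."""
    fields = header.split("|")
    return {
        "gi": fields[1],
        "accession": fields[3],
        "description": fields[4].strip(),
    }


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns, skipping any column with a gap.

    Assumption: both strings come from an existing alignment and have equal
    length. Gap columns are excluded from the denominator here, which is only
    one of several conventions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Header and first sequence line only, copied from the protein record above.
    example = (
        ">gi|54792750|ref|NP_001006647.1| erythropoietin [Canis lupus familiaris]\n"
        "MCEPAPPKPTQSAWHSFPECPALLLLLSLLLLPLGLPVLGAPPRLICDSRVLERYILEAREAENVTMGCA\n"
    )
    for header, sequence in parse_fasta(example):
        print(parse_ncbi_header(header), len(sequence))
```

Because gap columns are skipped, the value returned by `percent_identity` may differ from the identity figure BLAST reports, which is computed over the full alignment length including gaps.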