End of semester main examination - UR-CST
Reg. No: 0328/11

KIGALI INSTITUTE OF SCIENCE AND TECHNOLOGY
INSTITUT DES SCIENCES ET TECHNOLOGIE
Avenue de l'Armée, B.P. 3900 Kigali, Rwanda

INSTITUTE EXAMINATIONS – ACADEMIC YEAR 2012-2013
END OF SEMESTER MAIN EXAMINATION

FACULTY OF SCIENCE
SCIENCE-4-CHEMISTRY
FOURTH YEAR, SEMESTER II
CHE 3443 Proteins, Enzymes & Bioinformatics

DATE: 30/04/2013
TIME: 3 HOURS
MAXIMUM MARKS = 60

INSTRUCTIONS
1. This paper contains TWO sections.
2. Answer Question ONE in Section A, and any TWO out of THREE questions in Section B, of which Question FOUR is compulsory.
3. To answer Question FOUR, create a Word document and e-mail the file to [email protected]
4. No written materials are allowed into the examination room.
5. Write all answers in the booklet provided.
6. Do not forget to write your Registration Number.
7. All questions carry a maximum of 20 marks each.

Section A

1. i. State the use of the following databases in bioinformatics: Pfam, PDB, DALI, and GenBank. (5 marks)
   ii. Explain what is meant by specialized databases. (2 marks)
   iii. Give at least three examples of specialized databases. (3 marks)
   iv. What type of information is contained in specialized databases? (4 marks)
   v. Name 3 protein databases. (3 marks)
   vi. What is cross-referencing in databases? Why is it important for a user? (3 marks)

Section B

2. i. Fill out the dynamic programming table for determining the optimum global alignment between the sequences CGGA and ACTG. Assume that a match is scored +3 and that mismatches and spaces are scored -1 each. (12 marks)
   ii. Determine the optimum alignment corresponding to the table in part (i) and its score. (8 marks)

3. i. There are different amino acid scoring matrices, with PAM and BLOSUM being the most common matrices used for protein sequence comparison. How should an appropriate scoring matrix be chosen? (5 marks)
   ii. Describe the major categories of bioinformatic tools. (15 marks)

4. i. Find the amino acid sequence of the human myoglobin protein.
      a. Write down the amino acid sequence. (2 marks)
      b. What is the length of the sequence? (1 mark)
      c. Which database did you use to find the information? (1 mark)
      d. On which chromosome is the gene coding for this protein located in the human genome? (1 mark)
      e. In which tissues is the gene encoding this protein predominantly expressed? (1 mark)
      f. What is the function of this protein? (2 marks)
   ii. Find the amino acid sequence of the mouse myoglobin protein.
      a. Write down the amino acid sequence. (2 marks)
      b. On which chromosome is the gene coding for this protein located in the mouse genome? (1 mark)
   iii. Now search for homologs of the human myoglobin protein. To do this, go to the EBI website and choose SERVICES/PROTEINS/NCBI BLAST [protein]. After blasting the human myoglobin sequence, list the homologs with E value of 110109. Next to each homolog, give the name of the species it came from, the % identities, and the E value. (5 marks)
   iv. Finally, analyze the human myoglobin protein sequence as compared to its closest homologs found in (iii). To do this, use the sequences of these homologs in FASTA format and compare them using ClustalW.
      a. How many amino acid residues are not conserved in all the aligned sequences? (1 mark)
      b. From the data obtained, complete the following table. (3 marks)

         Seq A   Name   Seq B   Name   Score
         1              2
         1              3
         1              4
         1              5
         2              3
         2              4
         2              5
         3              4
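
Note on Question 2: the dynamic programming table for a global alignment is usually filled with the Needleman-Wunsch recurrence, in which each cell takes the best of a diagonal step (match or mismatch), a step down (gap in the second sequence), or a step right (gap in the first sequence). The Python sketch below is only an illustration of that recurrence, not a model answer. It assumes that mismatches and spaces are penalties of -1 each; the function name `needleman_wunsch` and its parameter names are hypothetical and do not come from the exam paper.

```python
def needleman_wunsch(a, b, match=3, mismatch=-1, gap=-1):
    """Fill a global-alignment table and trace back one optimal alignment.

    Scores are assumptions for illustration: +3 match, -1 mismatch, -1 gap.
    Rows of the table correspond to `a`, columns to `b`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]

    # First column and first row: aligning a prefix against gaps only.
    for i in range(1, rows):
        table[i][0] = i * gap
    for j in range(1, cols):
        table[0][j] = j * gap

    # Each cell is the best of the diagonal, upper, and left neighbours.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            table[i][j] = max(
                table[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                table[i - 1][j] + gap,       # gap in b
                table[i][j - 1] + gap,       # gap in a
            )

    # Traceback from the bottom-right cell; ties prefer the diagonal,
    # so only one of possibly several optimal alignments is returned.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if a[i - 1] == b[j - 1] else mismatch
            if table[i][j] == table[i - 1][j - 1] + pair:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and table[i][j] == table[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return table, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    table, top, bottom = needleman_wunsch("CGGA", "ACTG")
    for row in table:
        print(" ".join(f"{value:3d}" for value in row))
    print(top)
    print(bottom)
    print("score:", table[-1][-1])
```

The optimal score is the value in the bottom-right cell of the table. Other optimal alignments with the same score may exist, because the traceback breaks ties in a fixed order.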