Bioinformatics Individual Projects

Directions for getting started on the Individual Projects

1. Obtain the gene name and PDB number for your project. We won’t use the PDB number until next week.
2. Open two Word documents: one to type and copy/paste information about your project as you work, and the other to collect FASTA protein sequences for an alignment.
3. Use the KRas tutorial to find step-by-step directions for collecting information about your gene. Go to the same databases and look at the same types of information that we did for KRas, but collect information about your gene instead. You are collecting information for a report about your gene and its connection to a genetic disease. Your report should ultimately explain the link from genotype to phenotype for the SNP that is given to you in the mutant sequence.
   a. NCBI Gene: copy the wildtype protein sequence (FASTA) into your Word sequence document.
   b. OMIM
   c. UniProt / ExPASy: record information about protein location and function, as well as the information under the “Features” section.
   d. KEGG: look at upstream and downstream events around your protein and predict what would happen if your protein were more or less active.
   e. Go to the Bio3055 site and get the mutant cDNA sequence for your project.
   f. Translate the mutant sequence using the EMBOSS sixpack or EMBOSS transeq tools.
   g. Use the wildtype protein sequence and BLAST to obtain 4 more homologous protein sequences for your multiple sequence alignment. Copy those 4 FASTA-formatted sequences to your Word sequence file too.
   h. Use ClustalW to align all 6 sequences (wildtype, mutant, plus 4 homologous sequences).
   i. Save the two Word files and your ClustalW files to a flash drive or email them to yourself.
   j. Identify the mutation by comparing the wildtype and mutant sequences.

Next time

1. Open the PDB file in FirstGlance in Jmol.
2. Find the article that corresponds to your protein’s crystal structure and read a little about the authors’ structure/function analysis.
3. Find the amino acid position that is mutated in the structure and predict what happens to the protein’s function when the mutation occurs.
4. Create and save a picture of the structure that shows the mutant position.

Assembling your report

Your report will probably be around 3-5 pages, typed and double-spaced, including figures. It should include the following parts:

a. OMIM, Gene, KEGG, and UniProtKB information: summarize it in the text and reference the websites in a bibliography at the end of the paper.
b. Figures
   i. Pretty plot figure of the multiple sequence alignment: mark on it the mutation your project focuses on and some features from the UniProt entry.
   ii. FirstGlance in Jmol: a picture showing the role of the mutated amino acid.

Answer key for individual projects

Gene name (Homo sapiens)   PDB    Mutation in alignment   Corresponding mutation in PDB ortholog
KRAS                       1AGP   G12C                    D12
CYP1A1                     1OG5   I462V                   L440
MTATP6                     1C17   L156R                   L207
MTCO1                      1OCC   +KEK at end             K516
LDLR                       1N7D   W166S                   W144
HMGCR                      1HW9   D690A                   D690
HPRT1                      1BZY   D194N                   D193
PAH                        1KW0   E280K                   E280
SOD1                       1B4L   H46R                    H46
CASP1                      1IBC   H216D                   H237

Note: The numbering in the structure sometimes differs from the alignment. This can be because the structure is an ortholog, because a leader sequence is cleaved (LDLR), or because the initial Met is cleaved (HPRT1).
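Steps f and j of the getting-started directions (translating the mutant cDNA and comparing it with the wildtype protein) can also be sketched in a few lines of Python, which may help when checking what the EMBOSS tools and the alignment show. This is only an illustration under stated assumptions: the function names `translate`, `find_substitutions`, and `structure_position` are made up for this example, the two cDNA fragments are short toy sequences rather than the ones from the course site, translation starts at the first base in frame 1, and the standard genetic code is used. Mitochondrial genes such as MTATP6 and MTCO1 are translated with the vertebrate mitochondrial code instead, so this table would not be appropriate for them.

```python
# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: nuclear genes only; mitochondrial genes need a different table.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna):
    """Translate a cDNA string in frame 1, stopping at the first stop codon."""
    seq = "".join(cdna.split()).upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")  # X marks an unknown codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def find_substitutions(wildtype, mutant):
    """List point substitutions such as 'G12C' (1-based positions).

    Assumes the two proteins line up residue by residue with no gaps;
    insertions, deletions, or extensions need a real alignment instead.
    """
    return [
        f"{w}{pos}{m}"
        for pos, (w, m) in enumerate(zip(wildtype, mutant), start=1)
        if w != m
    ]


def structure_position(alignment_position, residues_cleaved):
    """Shift a position when N-terminal residues are missing from the structure."""
    return alignment_position - residues_cleaved


# Toy fragments (hypothetical, for illustration only): one base differs.
wildtype_cdna = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTGGTGGCGTAGGCAAGAGT"
mutant_cdna = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTTGTGGCGTAGGCAAGAGT"

wildtype_protein = translate(wildtype_cdna)
mutant_protein = translate(mutant_cdna)
print(wildtype_protein)
print(mutant_protein)
print(find_substitutions(wildtype_protein, mutant_protein))

# Initial Met cleaved, as in the HPRT1 row of the answer key: 194 becomes 193.
print(structure_position(194, 1))
```

The position shift only covers the cleavage cases mentioned in the note. When the structure is an ortholog, the corresponding residue has to be read from the alignment rather than computed from a fixed offset.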