Bioinformatics Individual Projects
Directions for getting started on the Individual Projects
1. Obtain the gene name and PDB ID for your project. We won’t use the PDB ID
until next week.
2. Open two Word documents: one to type and copy/paste information about your project
as you work, and the other to collect FASTA protein sequences for an alignment.
3. Use the KRas tutorial to find step-by-step directions for collecting information about
your gene. You should go to the same databases and look at the same types of
information that we did for KRas but collect information about your gene instead. You
should be collecting information to put into a report about your gene and its connection
to a genetic disease. Your report should ultimately include an explanation for the link
from genotype to phenotype for the SNP that is given to you in the mutant sequence.
a. NCBI-Gene – copy the wildtype protein sequence (FASTA) into your Word
sequence document
b. OMIM
c. UniProt – ExPASy – record information about protein location and function as
well as information under the “Features” section
d. KEGG – look at upstream and downstream events around your protein and
predict what would happen if your protein were more active or less active
e. Go to the Bio3055 site and get the mutant cDNA sequence for your project
f. Translate the mutant sequence using EMBOSS sixpack or EMBOSS transeq tools
g. Use the wildtype protein sequence and BLAST to obtain 4 more homologous
protein sequences for your multiple sequence alignment. Copy those 4 FASTA
formatted sequences to your Word sequence file too
h. Use ClustalW to align all 6 sequences (wildtype, mutant, plus 4 homologous
sequences)
i. Save the two Word files and your ClustalW files to a flash drive or email them to
yourself.
j. Identify the mutation by comparing the wildtype and mutant sequences.
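Steps f and j can also be checked by hand with a short script. The following is a minimal Python sketch, not a replacement for EMBOSS or ClustalW. It assumes the mutant cDNA starts at the ATG in reading frame 1 and that the mutation is a simple substitution, so the two proteins have the same length. The function names and the short KRAS-like fragment are illustrative only; they are not the sequences posted on the course site.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cdna):
    """Translate a cDNA string in frame 1, stopping at the first stop codon."""
    cdna = cdna.upper().replace("U", "T")
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE.get(cdna[i:i + 3], "X")  # X marks an unrecognized codon
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def find_substitutions(wildtype, mutant):
    """List substitutions such as 'G12C' between two equal-length proteins."""
    if len(wildtype) != len(mutant):
        raise ValueError("lengths differ; use the alignment to find insertions or deletions")
    return [
        f"{w}{pos}{m}"
        for pos, (w, m) in enumerate(zip(wildtype, mutant), start=1)
        if w != m
    ]


# Illustrative fragment only (hypothetical input, not the course's mutant file).
wildtype_fragment = "MTEYKLVVVGAGGVGKS"
mutant_cdna = "ATGACTGAATATAAACTTGTGGTAGTTGGAGCTTGTGGCGTAGGCAAGAGT"

mutant_fragment = translate(mutant_cdna)
print(find_substitutions(wildtype_fragment, mutant_fragment))
```

If the lengths differ, as with an insertion at the end of the protein, the position-by-position comparison no longer applies and the ClustalW alignment is the right place to look.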
Next time
1. Open the PDB file in FirstGlance in Jmol.
2. Find the article that corresponds with your protein’s crystal structure and read a little
about the authors’ structure/function analysis.
3. Find the amino acid position that is mutated in the structure and predict what happens
to the protein’s function when the mutation occurs.
4. Create and save a picture of the structure that shows the mutant position.
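Before looking for the mutated position in the viewer, it can help to confirm which residue the structure file actually has at that number. The sketch below is one possible way to do that in Python by reading the fixed-column ATOM records of a PDB-format file. The helper names are hypothetical, and the three-residue input is toy data with dummy coordinates generated inside the script; it is not taken from any real PDB entry.

```python
def toy_atom_line(serial, atom, resname, chain, resseq):
    """Build a fixed-column ATOM record with dummy 0.000 coordinates (toy data)."""
    return (
        f"ATOM  {serial:>5} {atom:<4} {resname:>3} {chain}{resseq:>4}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}"
    )


def residue_names_at(pdb_text, chain, resseq):
    """Return the residue names found at a given chain and residue number."""
    names = set()
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        # PDB columns: 18-20 residue name, 22 chain ID, 23-26 residue number.
        if line[21] == chain and int(line[22:26]) == resseq:
            names.add(line[17:20].strip())
    return names


# Toy three-residue "structure" (hypothetical, for illustration only).
toy_pdb = "\n".join([
    toy_atom_line(1, "CA", "ALA", "A", 11),
    toy_atom_line(2, "CA", "ASP", "A", 12),
    toy_atom_line(3, "CA", "GLY", "A", 13),
])

print(residue_names_at(toy_pdb, "A", 12))
```

For a real project, `toy_pdb` would be replaced by the text of the downloaded PDB file, and the chain letter and residue number by the ones for your protein.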
Assembling your report:
Your report should include the following parts and will probably be around 3-5 pages
typed, double-spaced, with figures and the information below:
a. OMIM, Gene, KEGG, and UniProtKB information – summarize this information and
cite the websites in a bibliography at the end of the paper
b. Figures
i. Pretty plot figure of the multiple sequence alignment – mark on this the
mutation your project focuses on and some features from the UniProt
entry
ii. FirstGlance in Jmol – showing the role of the mutated amino acid
Answer key for individual projects:
Gene name for      Pdb     Mutation in      Corresponding mutation
Homo sapiens               alignment        in pdb ortholog
KRAS               1AGP    G12C             D12
CYP1A1             1OG5    I462V            L440
MTATP6             1C17    L156R            L207
MTCO1              1OCC    +KEK at end      K516
LDLR               1N7D    W166S            W144
HMGCR              1HW9    D690A            D690
HPRT1              1BZY    D194N            D193
PAH                1KW0    E280K            E280
SOD1               1B4L    H46R             H46
CASP1              1IBC    H216D            H237
Note: Sometimes the numbering is different in the structure because it is an ortholog and sometimes it
is because a leader sequence is cleaved (LDLR) or the initial Met is cleaved (HPRT1).
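The size of each numbering shift can be read straight off the answer key. As a small worked example in Python, the sketch below subtracts the structure position from the alignment position for each row. The dictionary simply restates the key, MTCO1 is omitted because its alignment entry ("+KEK at end") has no residue number, and the variable and function names are illustrative.

```python
import re

# (mutation in alignment, corresponding residue in the PDB structure), from the key.
ANSWER_KEY = {
    "KRAS": ("G12C", "D12"),
    "CYP1A1": ("I462V", "L440"),
    "MTATP6": ("L156R", "L207"),
    "LDLR": ("W166S", "W144"),
    "HMGCR": ("D690A", "D690"),
    "HPRT1": ("D194N", "D193"),
    "PAH": ("E280K", "E280"),
    "SOD1": ("H46R", "H46"),
    "CASP1": ("H216D", "H237"),
}


def position(label):
    """Extract the residue number from a label such as 'D194N' or 'D193'."""
    return int(re.search(r"\d+", label).group())


# Positive offset: the structure numbering is lower than the alignment numbering.
offsets = {
    gene: position(in_alignment) - position(in_structure)
    for gene, (in_alignment, in_structure) in ANSWER_KEY.items()
}

for gene, offset in offsets.items():
    print(f"{gene}: alignment position minus structure position = {offset}")
```

An offset of 1 for HPRT1 is what removal of the initial Met would produce. A residue letter that differs between the two columns (for example I462 versus L440 for CYP1A1) indicates that the structure is an ortholog rather than the human protein.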