Virtual modelling of proteins
Jacek Leluk
Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw
Main functions of proteins (selected):
Enzymes
Immunoglobulins
Transport factors (e.g. hemoglobin)
Hormones, neurotransmitters
Structural and storage proteins
Contractile proteins (muscles, flagella)
Protein – a polymer of amino acids.
A protein consists of one or more chains.
Proteins that contain other components (sugars, lipids,
nucleotides, metal ions, other compounds...) are called proteids.
The basic unit of a protein is an amino acid. There are 20
biogenic (genetically encoded) amino acids.
Amino acids
Amino acid – an organic compound that contains an amino group
and an acidic group (usually a carboxyl group)
General formula
Alanine
Amino acid – polypeptide – protein
Protein chain folding
Diversity of proteins
Glucagon
Insulin
ROP protein
Diversity of proteins
Light-harvesting protein from purple bacteria
Sequence – structure - function
At first, the central dogma of molecular biology assumed a
very strict relationship between genetic information, protein
structure and function:
1 gene → 1 sequence → 1 structure → 1 function
At present this dogma is still valid, but in a less strict
form than before. These relationships are not strictly
one-to-one.
For example, a protein with a given sequence may adopt
different secondary and tertiary structures.
Sequence – structure - function
All information about a protein's structure (and its function as
well) is contained in its amino acid sequence, which is
unique for each protein.
To apply these relationships in
protein modelling, we first have to learn to read and
understand the information "written" in the amino acid
sequence.
How well we currently understand this "writing"
depends on the complexity of the protein; prediction
accuracy ranges from 20% to 80%.
What do we have?
Biomolecular databases (genomic, protein and bibliographic)
Tools for theoretical analysis of biomolecules
Labs for experimental verification of the results
Knowledge (theories, hypotheses, theoretical models)
Regular types of structure
(secondary structure)
α-helix
Regular types of structure
(secondary structure)
-chain (-sheet)
sheet
Jacek Leluk
3D protein structures
Structure-function relationship
Sea anemone toxin
Snake toxin
3D protein structures
Structure-function relationship
Bacterial RNase
Mammalian RNase
RNase inhibitor (inhibits both RNases)
Errors (mutations) and resulting implications
Sickle cell anemia
Sickle cell anemia – a genetic disease caused by a single
amino acid substitution in the hemoglobin β-chain (one residue of 146).
Hemoglobin S has Val instead of Glu at position 6 of the β-chain.
Homozygotes (HbSS) develop the severe, potentially lethal disease; heterozygotes (HbAS) are
mildly affected or asymptomatic, but partially resistant to malaria.
Normal hemoglobin – β chain
VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK
VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG
KEFTPPVQAAYQKVVAGVANALAHKYH
Hemoglobin S – β chain
VHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK
VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG
KEFTPPVQAAYQKVVAGVANALAHKYH
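The single substitution can be located by comparing the two β-chain sequences position by position. The sketch below is only an illustration, not a tool from the presentation: the function name `substitutions` is hypothetical, and it assumes both sequences have equal length with no insertions or deletions.

```python
# Beta-chain sequences copied from the slide (146 residues each).
NORMAL = (
    "VHLTPEEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK"
    "VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG"
    "KEFTPPVQAAYQKVVAGVANALAHKYH"
)
SICKLE = (
    "VHLTPVEKSAVTALWGKVNVDEVGGEALGRLLVVYPWTQRFFESFGDLSTPDAVMGNPK"
    "VKAHGKKVLGAFSDGLAHLDNLKGTFATLSELHCDKLHVDPENFRLLGNVLVCVLAHHFG"
    "KEFTPPVQAAYQKVVAGVANALAHKYH"
)


def substitutions(ref, alt):
    """Return (position, ref_residue, alt_residue) for every mismatch.

    Positions are 1-based. Assumption: equal-length, ungapped sequences.
    """
    if len(ref) != len(alt):
        raise ValueError("sequences must have equal length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(ref, alt), start=1)
        if a != b
    ]


print(len(NORMAL), substitutions(NORMAL, SICKLE))
# prints: 146 [(6, 'E', 'V')]
```

The only difference reported is Glu (E) replaced by Val (V) at position 6, in agreement with the description of hemoglobin S above.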
Mutations and resulting implications
Sickle cell anemia
Hemoglobin: normal and altered
Glucagon (pig) – hormone, 29 amino acids
HSQGTFTSDYSKYLDSRRAQDFVQWLMNT
Glucagon (synthetic) – hormone, 29 amino acids
HSQGTFTSDYSKYLDSKKAQEFVQWLMNT
„Gluca con” modelling
Glucagon (pig) – HSQGTFTSDYSKYLDSRRAQDFVQWLMNT
Glucagon (synth.) – HSQGTFTSDYSKYLDSKKAQEFVQWLMNT
Gluca con – LAALIAAVAAAIAAVLRRIAEVLAIVAAL
Hydrophobic amino acids:
L, I, V, F, M, Y, (W)
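To see how the designed sequence differs from the natural hormone in hydrophobic content, each residue can be marked against the set listed above. This is a minimal sketch under stated assumptions: W is counted as hydrophobic although the slide gives it in parentheses, alanine is not counted because the slide does not list it, and the helper names `hydrophobic_mask` and `hydrophobic_count` are hypothetical.

```python
# Hydrophobic set as listed on the slide; W appears there in parentheses
# and is included here by assumption.
HYDROPHOBIC = set("LIVFMYW")

# Sequences copied from the slide.
SEQUENCES = {
    "Glucagon (pig)": "HSQGTFTSDYSKYLDSRRAQDFVQWLMNT",
    "Glucagon (synth.)": "HSQGTFTSDYSKYLDSKKAQEFVQWLMNT",
    "Gluca con": "LAALIAAVAAAIAAVLRRIAEVLAIVAAL",
}


def hydrophobic_mask(seq, hydrophobic=HYDROPHOBIC):
    """Mark each residue: '*' if it is in the hydrophobic set, '-' otherwise."""
    return "".join("*" if aa in hydrophobic else "-" for aa in seq)


def hydrophobic_count(seq, hydrophobic=HYDROPHOBIC):
    """Count residues that belong to the hydrophobic set."""
    return sum(aa in hydrophobic for aa in seq)


for name, seq in SEQUENCES.items():
    print(f"{name:18s} {seq}")
    print(f"{'':18s} {hydrophobic_mask(seq)}  "
          f"({hydrophobic_count(seq)}/{len(seq)})")
```

With this set, both glucagon sequences contain 9 hydrophobic residues out of 29, and the designed sequence contains 13 out of 29.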
„Gluca con” design - results
Glucagon (pig) – HSQGTFTSDYSKYLDSRRAQDFVQWLMNT
Glucagon (synth.) – HSQGTFTSDYSKYLDSKKAQEFVQWLMNT
Gluca con – LAALIAAVAAAIAAVLRRIXEVLAIVAAL
Can we "improve" Nature at the
molecular level?
What for?
Our goal is to gain knowledge
of natural mechanisms and then
to apply this knowledge to our
needs, not to alter the evolved
mechanisms that occur naturally.
Role and significance of theoretical
protein modelling and design
Saving time
Saving money
Saving labour and materials
Increasing our knowledge
Supporting the experimental work
The value of virtual protein design
ID        Aligned sequence (ruler ticks at positions 3, 10, 20, 30, 40, 50, 60)
P01055    ESSKPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CKP
P01057    ESSKPCCDECACTKSIPPQCRCTDVRLNSCHSACSSCVCTFSIPAQCV-CVDMKDFCYAP-CKS
P01056    QSSKPCCBHCACTKSIPPQCRCTDLRLDSCHSACKSCICTLSIPAQCV-CBBIBDFCYEP-CKS
P01058    ESSKPCCDQCSCTKSMPPKCRCSDIRLNSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
P01059    ESSKPCCDLCTCTKSIPPQCHCNDMRLNSCHSACKSCICALSEPAQCF-CVDTTDFCYKS-CHN
P01063    ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
P17734    QSSKPCCRQCACTKSIPPQCRCSQVRLNSCHSACKSCACTFSIPAQCF-CGBIBBFCYKP-CKS
P81483    -SSKPCCBHCACTKSIPPQCRCSBLRLNSCHSECKGCICTFSIPAQCI-CTDTNNFCYEP-CKS
P81484    -SSKPCCBHCACTKSIPPQCRCSBLRLNSCHSECKGCICTFSIPAQCI-CTDTNNFCYEP-CKS
P16343    ESSKPCCSSC-CTRSRPPQCQCTDVRLNSCHSACKSCMCTFSDPGMCS-CLDVTDFCYKP-CKS
P01064    EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
P82469    -SSGPCCDRCRCTKSEPPQCQCQDVRLNSCHSACEACVCSHSMPGLCS-CLDITHFCHEP-CKS
P01061    ESSHPCCDLCLCTKSIPPQCQCADIRLDSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
P01062    ESSEPCCDSCDCTKSIPPECHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
P01060    QSSPPCCBICVCTASIPPQCVCTBIRLBSCHSACKSCMCTRSMPGKCR-CLBTTBYCYKS-CKS
1BBI:     ESSKPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CKP
1D6R:I    ---KPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CK
1DF9:C    ESSEPCCDSCDCTKSIPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
1PI2:     EYSKPCCDLCMCTRSMPPQCSCED-RINSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
1PBI:A    DVKSACCDTCLCTKSNPPTCRCVDVGET-CHSACLSCICAYSNPPKCQ-CFDTQKFCYKQ-CHN
AAB4719   ESSKPCCDQCTCTKSIPPQCRCTDVRLNSCHSACSSCVCTFSIPAQCV-CVDMKDFCYAP-CKS
TISYC2    ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
JC2225    ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
TIZB2     ESSKPCCDQC-CTKSMPPKCRCSDIRLDSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
JC2073    ESSKPCCDECKCTKSEPPQCQCVDTRLESCHSACKLCLCALSFPAKCR-CVDTTDFCYKP-CKS
JC2072    ESSKPCCDECKCTKSEPPQCQCVDTRLESCHSACKLCLCALSFPAKCR-CVDTTDFCYKP-CKS
0506164   ESSKPCCDQC-CTKSMPPKCRCSDIRLDSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
0401177   ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
763679A   ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
TISYD2    EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
0907248   ESSEPCCDSCRCTKSIPPQCHCADIRLNSCHSACKSCMCTRSMPGKCR-CLDTDDFCYKP-CES
1102213   ESSEPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCH-CLDTHDFCHKP-CKS
1102213   ESSEPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
0404180   EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
TIZB1B    ESSHPCCDLCLCTKSIPPQCQCADIRLDSCHSACKSCMCTRSMPGQCH-CLDTHDFCHKP-CKS
TIMB      ESSEPCCDSCDCTKSKPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
TIZB1P    ESSHPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
JC1066    ESSEPCCDSCDCTKSKPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCTKP-CES
Q41066    DVKSACCDTCLCTKSDPPTCRCVDVGET-CHSACDSCICALSYPPQCQ-CFDTHKFCYKA-CHN
P80321    STTTACCDFCPCTRSIPPQCQCTDVREK-CHSACKSCLCTLSIPPQCH-CYDITDFCYPS-CR
Q41065    DVKSACCDTCLCTKSNPPTCRCVDVRET-CHSACDSCICAYSNPPKCQ-CFDTHKFCYKA-CHN
P81705    --TSACCDKCFCTKSNPPICQCRDVGET-CHSACKFCICALSYPAQCH-CLDQNTFCYDK-CDS
P56679    DVKSACCDTCLCTKSNPPTCRCVDVGET-CHSACLSCICAYSNPPKCQ-CFDTQKFCYKA-CHN
P16346    --TTACCNFCPCTRSIPPQCRCTDIGET-CHSACKTCLCTKSIPPQCH-CADITNFCYPK-CN
P01065    DVKSACCDTCLCTRSQPPTCRCVDVGER-CHSACNHCVCNYSNPPQCQ-CFDTHKFCYKA-CHS
P24661    DVKSACCDTCLCTKSEPPTCRCVDVGER-CHSACNSCVCRYSNPPKCQ-CFDTHKFCYKS-CHN
P07679    KRPWECCDIAMCTRSIPPICRCVDKVDR-CSDACKDCEETEDN--RHV-CFDTYIGDPGPTCHD
P19860    ERPWKCCDLQTCTKSIPAFCRCRDLLEQ-CSDACKECGKVRDSDPPRYICQDVYRGIPAPMCHE
P22737    ERPWKCCDLQTCTKSIPAFCRCRDLLEQ-CSDACKECGKVRDSDPPRYICQDVYRGIPAPMCHE
220645    ES-EGCCDRCICTKSMPPQCHCHDVRLDSCHSDCETCICTRSYPAQCR-CADTTDFCYKP-C-S
P09864    TRPWKCCDRAICTKSFPPMCRCMDMVEQ-CAATCKKCGPATSDSSRRV-CEDXY----------
P09863    KRPWKCCDQAVCTRSIPPICRCMDQVFE-CPSTCKACGPSVGDPSRRV-CQDQYV----------
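One simple way to read an alignment like the one above is to look for columns in which every sequence carries the same residue. The sketch below does this for four of the rows; it is an illustration only, not the method used in the presentation. It assumes that identifiers and sequences pair up in order of appearance, that "-" marks a gap, and the function name `conserved_columns` is hypothetical.

```python
GAP = "-"

# Four rows copied from the alignment on the slide. The ID-to-sequence
# pairing is an assumption based on order of appearance.
ROWS = {
    "P01055": "ESSKPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CKP",
    "P01057": "ESSKPCCDECACTKSIPPQCRCTDVRLNSCHSACSSCVCTFSIPAQCV-CVDMKDFCYAP-CKS",
    "P07679": "KRPWECCDIAMCTRSIPPICRCVDKVDR-CSDACKDCEETEDN--RHV-CFDTYIGDPGPTCHD",
    "P19860": "ERPWKCCDLQTCTKSIPAFCRCRDLLEQ-CSDACKECGKVRDSDPPRYICQDVYRGIPAPMCHE",
}


def conserved_columns(rows, gap=GAP):
    """Return (position, residue) for fully conserved, non-gap columns.

    Positions are 1-based alignment columns. zip() stops at the shortest
    row, so rows are expected to have equal length.
    """
    conserved = []
    for pos, column in enumerate(zip(*rows), start=1):
        residues = set(column)
        if len(residues) == 1 and gap not in residues:
            conserved.append((pos, column[0]))
    return conserved


conserved = conserved_columns(list(ROWS.values()))
print("conserved residues:", "".join(res for _, res in conserved))
print("conserved Cys columns:", [pos for pos, res in conserved if res == "C"])
```

Among these four rows the Cys-Cys-Asp motif at columns 6 to 8 is fully conserved, while column 1 is not. Running the same function over all rows would give a stricter picture, since a single divergent or gapped sequence removes a column from the result.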