Wrong assumptions and misinterpretations
in explanations of biological models,
phenomena and processes
or
Is a biologist logical,
and is a computer scientist alive?
Jacek Leluk
ICM UW
How is it that your genome is 98% identical to the
genome of a chimpanzee, yet only 50% identical
to your own father’s genome?
"O składności członów człowieczych" ("On the arrangement of human body parts")
Why do birds give no milk?
Because they would need teats, which would
hinder them in flight.
Andrzej z Kobylina (16th century)
Is biology „bilogical”?
Nomenclature chaos:
• Mitochondria or chondriosomes?
• Is papain a proteolytic enzyme?
• definition of identity, similarity and homology
Misinterpretation:
• Amino acid sequence of a gene?
• Why are squash inhibitors inhibitors?
• Is wheat agglutinin meant to agglutinate rabbit red cells?
Incomplete knowledge
• Stochastic index matrices
• Statistical description of biological processes
The problem of terminology
• BPTI
- Basic Pancreatic Trypsin Inhibitor
- Bovine Pancreatic Trypsin Inhibitor
- Basic Protein Trypsin Inhibitor
• PAM
- Point Accepted Mutations
- Percent Accepted Mutations
• Kunitz trypsin inhibitor
- BPTI - mammalian organs
- STI - soybean trypsin inhibitor
What may everybody do wrong?
Monte Carlo approach in structure analysis and prediction – what state do we predict?
Mathematical modelling of life processes –
- Markov chains and protein evolution and differentiation
- estimation of similarity significance
What may biologists do wrong?
Amino acids and proteins –
- do proteins consist of amino acids as we describe?
Definitions and theory –
- definition of species and theory of evolution
- definitions and biology
Correlated mutations –
- dispersed correlation
What may theoreticians do wrong?
Primitive or ancestral? –
- (Cyanophyta, Archaebacteria, ape and human)
Global and local energy minima –
- can we predict the exact conformation at exact time?
Microscopic/mesoscopic/macroscopic processes - water molecule and tsunami
Assumptions and conclusions –
- incomplete assumptions and wrong conclusions
- deformations by simplifying
- is the protein sequence just a string of characters?
Sequence identity estimation
in proteomics and genomics
Identity threshold – does it make sense?
WHAT IS IMPORTANT IN THE PROTEIN SIMILARITY SEARCH?
(each pair: number of identical positions, % identity, verdict)
1) Contribution (%) of identical positions
PKILMECKKD   8
PKILMKCKHD   80%     similar
PKILMECKKD   2
SDCLLDCVCL   20%     not similar
2) Length of the compared strings (sequences)
LCE   1
WCG   33.3%     casual
PKILMECKKD   8
PKILMKCKHD   80%     similar
MVEICIEPKIRCIKVCTKDERITCLILDET   8
MVYWCPRRFMHCVHLKAGGCTCWCLRLDYY   26%     probably similar
3) Distribution of the identical positions along the analyzed sequence
MVEMICIEPKIRCIKVCTKDERITL   5
HVYYWRPERFMHTVKLKAGGCRCWL   20%     casual
MVEMIMAGDARCIKVCTKDERITCL   5
HHYYWMAGDAHTVQLKAGGCWCWAG   20%     similar
4) Residues at conservative positions
MVCPKILMKCKHDSDCLLDCVCLED
EDEGKRRTKREHFKESNLAAAFKEQ     not similar
MVCPKILMKCKHDSDTLLDCVCLED
QNCPGPREWCFTTRMNDSSCACPQT     similar
5) Structural/genetic similarity of the amino acids at non-conservative positions
(the same pair is shown three times, highlighted by: Identity only, Structural, Genetic)
MVCPKILMKCKHDSDCLLDCVCLED
RLCRRLVKRCRKETECIVECICIDE
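The percentages quoted for the example pairs can be checked with a few lines of Python. This is only a minimal sketch: the function names `percent_identity` and `identical_positions` are illustrative and not part of the original slides, and gapped alignments are deliberately not handled.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions holding the same residue in two equal-length strings."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length (gaps are not handled here)")
    identical = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * identical / len(seq_a)


def identical_positions(seq_a: str, seq_b: str) -> list:
    """1-based indices of identical positions, to inspect how they are distributed."""
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1) if a == b]


# Example pairs taken from the slides.
pairs = [
    ("PKILMECKKD", "PKILMKCKHD"),
    ("PKILMECKKD", "SDCLLDCVCL"),
    ("LCE", "WCG"),
    ("MVEMICIEPKIRCIKVCTKDERITL", "HVYYWRPERFMHTVKLKAGGCRCWL"),
    ("MVEMIMAGDARCIKVCTKDERITCL", "HHYYWMAGDAHTVQLKAGGCWCWAG"),
]

for a, b in pairs:
    print(f"{percent_identity(a, b):5.1f}%  identical at {identical_positions(a, b)}")
```

The last two pairs both score 20%, but in one the identical positions are scattered and in the other they form a contiguous block, which is exactly the distinction made in point 3.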
Sequence multiple alignment
Problem of gap manipulation
Any protein can be aligned with any other
as if homologous/similar
anybiologicalstring
anybilogicalstrip
anyprotein
canbealigned
anybiologicalstri-ng
anybi-logicalstrip
-an-yprote--i-n
canb-----ealigned
Statistical approaches vs. accuracy
How far may they be improved?
Protein secondary structure prediction – accuracy 70-72%
(not much changed since 1978)
100% accuracy requires a complete database of all
possible structures.
For 30 AA polypeptides – 20^30 sequences/secondary
structures
Searching the database for the appropriate
sequence/structure at a rate of 10^12 sequences/sec.
would take 1.8 billion times longer than the age
of the Universe.
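The order of magnitude of this estimate can be reproduced with a short calculation. This is a rough sketch: the age of the Universe used below (13.8 billion years) is an assumption that does not come from the slides, so the resulting multiple differs somewhat from the figure quoted above while remaining in the range of billions.

```python
ALPHABET_SIZE = 20          # standard amino acids
LENGTH = 30                 # residues in the polypeptide
RATE_PER_SECOND = 10**12    # search rate given on the slide

# Assumption (not from the slides): age of the Universe in years.
AGE_OF_UNIVERSE_YEARS = 13.8e9
SECONDS_PER_YEAR = 365.25 * 24 * 3600

n_sequences = ALPHABET_SIZE ** LENGTH
search_seconds = n_sequences / RATE_PER_SECOND
ratio = search_seconds / (AGE_OF_UNIVERSE_YEARS * SECONDS_PER_YEAR)

print(f"possible sequences: {n_sequences:.3e}")
print(f"search time: {search_seconds:.3e} s")
print(f"multiples of the age of the Universe: {ratio:.2e}")
```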
Genetic conditioning of the amino acid
replacement probabilities and spectrum
in molecular evolution
The Markov model assumes that the substitution probability of
amino acid AA1 by AA2 is the same, regardless of what the initial
residue AA1 was transformed from (AAx, AAy)
AAx → AA1 → AA2 (probability Pa)
AAy → AA1 → AA2 (probability Pb)
Pa = Pb
The currently used statistical algorithms are based on a Markovian
model of amino acid replacement (they directly use stochastic
matrices of replacement frequency indices)
BLOSUM62 matrix of amino acid replacements
A   4
R  -1   5
N  -2   0   6
D  -2  -2   1   6
C   0  -3  -3  -3   9
Q  -1   1   0   0  -3   5
E  -1   0   0   2  -4   2   5
G   0  -2   0  -1  -3  -2  -2   6
H  -2   0   1  -1  -3   0   0  -2   8
I  -1  -3  -3  -3  -1  -3  -3  -4  -3   4
L  -1  -2  -3  -4  -1  -2  -3  -4  -3   2   4
K  -1   2   0  -1  -3   1   1  -2  -1  -3  -2   5
M  -1  -1  -2  -3  -1   0  -2  -3  -2   1   2  -1   5
F  -2  -3  -3  -3  -2  -3  -3  -3  -1   0   0  -3   0   6
P  -1  -2  -2  -1  -3  -1  -1  -2  -2  -3  -3  -1  -2  -4   7
S   1  -1   1   0  -1   0   0   0  -1  -2  -2   0  -1  -2  -1   4
T   0  -1   0  -1  -1  -1  -1  -2  -2  -1  -1  -1  -1  -2  -1   1   5
W  -3  -3  -4  -4  -2  -2  -3  -2  -2  -3  -2  -3  -1   1  -4  -3  -2  11
Y  -2  -2  -2  -3  -2  -1  -2  -3   2  -1  -1  -2  -1   3  -3  -2  -2   2   7
V   0  -3  -3  -3  -1  -2  -2  -3  -3   3   1  -2   1  -1  -2  -2   0  -3  -1   4
    A   R   N   D   C   Q   E   G   H   I   L   K   M   F   P   S   T   W   Y   V
Why is tryptophan the most conservative residue here?
Replacement Arg → Lys according to the statistical
interpretation using stochastic matrix indices
Matrix       Arg → Lys index
PAM250       3
BLOSUM62     2
BLOSUM35     2
BLOSUM45     3
BLOSUM100    3
Arginine-to-lysine mutational replacements
(R = A or G, Y = C or T)
Met (ATG) → Arg (AGG) → Lys (AAG)
Leu (CTR), Gln (CAR) → Arg (CGR) → Arg (AGR) → Lys (AAR)
His (CAY), Ser (AGY) → Arg (CGY) → Arg (CGR) → Arg (AGR) → Lys (AAR)
Possible one-point-mutational processing of serine with
respect to its origin
Trp (UGG) → Ser (UCG) → Thr, Ala, Pro, Ser, Trp, Leu, stop (UAG)
Asn (AAU) → Ser (AGU) → Thr, Ile, Asn, Ser, Arg, Cys, Gly
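The two serine cases can be enumerated directly from the standard genetic code. The sketch below is illustrative only: the helper name `point_mutation_neighbours` is hypothetical, and the code table is the standard one written in the usual U, C, A, G ordering.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... ('*' = stop).
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def point_mutation_neighbours(codon: str) -> set:
    """Amino acids (one-letter, '*' = stop) reachable by changing exactly one base."""
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutant = codon[:pos] + base + codon[pos + 1:]
                reachable.add(GENETIC_CODE[mutant])
    return reachable


for ser_codon in ("UCG", "AGU"):
    print(ser_codon, sorted(point_mutation_neighbours(ser_codon)))
```

Both codons encode serine, yet the sets of residues they can turn into after one point mutation overlap only in Thr and Ser.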
Is arginine the same as arginine?
Possible codons for arginine:
AGA AGG CGA CGG CGC CGT
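A small sketch makes the point of the question concrete: the minimum number of point mutations separating arginine from lysine depends on which arginine codon is present. The Hamming distance used here is only a lower bound on the number of mutational steps (it ignores what the intermediate codons encode), and the helper names are illustrative.

```python
ARG_CODONS = ["AGA", "AGG", "CGA", "CGG", "CGC", "CGT"]  # as listed on the slide
LYS_CODONS = ["AAA", "AAG"]                              # standard genetic code


def hamming(codon_a: str, codon_b: str) -> int:
    """Number of nucleotide positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def min_steps_to_lys(arg_codon: str) -> int:
    """Smallest number of point mutations turning this Arg codon into any Lys codon."""
    return min(hamming(arg_codon, lys) for lys in LYS_CODONS)


steps = {codon: min_steps_to_lys(codon) for codon in ARG_CODONS}
for codon, n in steps.items():
    print(f"Arg {codon}: at least {n} point mutation(s) to Lys")
```

A single substitution matrix index for Arg/Lys cannot express this difference between codons.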
Diagram of codon genetic relationships
Diagram of amino acid genetic relationships
[Figure: network of codons, each node labelled with its amino acid and codon (e.g. K AAA, E GAA, R AGA); the stop codons UGA, UAG and UAA are marked "–"]
Genetic relationships between Arg and Met/Gln
[Figure: diagram of amino acid genetic relationships, nodes labelled with one-letter amino acid codes]
What part of the codon contains the information about the
previous amino acid that occurred at a certain position of the
protein sequence?
At most 2/3 of the entire codon.
Ala (GCG) → Val (GUG)
How long is the information about the codons of preceding
amino acids stored?
The shortest storage period is 3 transitions/transversions
Ala (GCG) → Val (GUG) → Met (AUG) → Ile (AUA)
Ser (UCC) → Ser (UCU) → Thr (ACU) → Ser (AGU)
Theoretically the longest period is infinite
Lys (AAA) → Asn (AAC) → Asp (GAC) → His (CAC) → Gln (CAG) → Glu (GAG) → Asp (GAU)
Tyr (UAU) → His (CAU) → Asn (AAU) → Lys (AAG) → Gln (CAG) → His (CAC) → ...
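The storage period in these chains can be counted mechanically. The sketch below assumes one simple reading of the slides: the "memory" of the original codon is taken to be erased once every one of the three positions has been mutated at least once. The function name is hypothetical.

```python
def steps_until_origin_erased(chain):
    """Return the step after which all three codon positions have been mutated
    at least once, or None if some position is never changed in the chain."""
    changed = set()
    for step, (prev, curr) in enumerate(zip(chain, chain[1:]), start=1):
        diff = [i for i in range(3) if prev[i] != curr[i]]
        if len(diff) != 1:
            raise ValueError(f"{prev} -> {curr} is not a single point mutation")
        changed.update(diff)
        if len(changed) == 3:
            return step
    return None


short_chain = ["GCG", "GUG", "AUG", "AUA"]                    # Ala, Val, Met, Ile
long_chain = ["AAA", "AAC", "GAC", "CAC", "CAG", "GAG", "GAU"]  # Lys ... Asp

print(steps_until_origin_erased(short_chain))  # all positions replaced after 3 steps
print(steps_until_origin_erased(long_chain))   # middle A is never replaced
```

In the long chain the second codon position stays A throughout, so part of the original codon survives however many steps are added.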
Correlated mutations
The phenomenon of several mutations occurring simultaneously
and dependent on each other
According to the current hypothesis of molecular positive Darwinian
selection, correlated mutations are related to the changes
occurring in their neighborhood, they reflect the protein-to-protein
interaction and they preserve the biological activity
and structural properties of the molecule
The current explanation of correlated mutations
occurrence (example)
[Figure: side-chain structures of Trp, Val, Leu and Ala]
The three types of distribution of correlated positions present in myoglobins
The residue location and relative distribution is shown on tertiary structure of human myoglobin (P0244, pdb1bzp)
The spot correlation cluster
Position no. and        Correlation versus position 127
occurring residues      A (58)      S (7)
127 [AMSTV]
27  [ADEFLNT]           ADEFNT      E
31  [GKRS]              GKRS        R
78  [AKLQ]              K           ALQ
109 [DEGNT]             DEGT        E
116 [AEHKQST]           AEHKQS      A
117 [AEKNQS]            AEKQS       E
122 [BDEN]              BDEN        D
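A table of this kind can be produced from a multiple alignment by grouping sequences according to the residue at the reference position and collecting what occurs at another position. The sketch below uses a hypothetical three-column toy alignment, not the myoglobin data, and the function name is illustrative.

```python
from collections import defaultdict


def residues_by_reference(alignment, ref_pos, other_pos):
    """For each residue found at ref_pos, collect the residues seen at other_pos.

    Positions are 0-based column indices into equal-length aligned sequences.
    """
    table = defaultdict(set)
    for seq in alignment:
        table[seq[ref_pos]].add(seq[other_pos])
    return dict(table)


# Hypothetical toy alignment (illustration only, not taken from the slides).
toy_alignment = ["AKE", "AKD", "ARE", "SRE", "SRE"]

result = residues_by_reference(toy_alignment, ref_pos=0, other_pos=1)
for ref_residue, seen in sorted(result.items()):
    print(ref_residue, "".join(sorted(seen)))
```

In the toy data, sequences with S at the reference column always carry R at the second column, while those with A carry K or R, which mirrors the way the rows of the table narrow down for the rarer reference residue.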
The three types of distribution of correlated positions present
in Bowman-Birk inhibitor family
The residue location and relative distribution is shown on tertiary structure of Bowman-Birk inhibitor from soybean (P01055)
The narrow correlation cluster
Position no. and        Correlation versus position 13
occurring residues      L (11)      M (10)      A (8)
13 [–ADFIKLMPRSTV]
4  [–RSTVY]             V           –S          S
5  [–KPST]              K           –S          S
7  [AEGKP]              A           P           P
11 [EFHIKLQRST]         T           EHQ         S
21 [EFIKMQT]            T           Q           EQ
The three types of distribution of correlated positions present in eglin-like
proteins.
The residue location and relative distribution is shown on tertiary structure of eglin C
(P01051)
The dispersed correlation
Position no. and        Correlation versus position 67
occurring residues      D (8)       G (9)
67 [–DGNT]
10 [–ELNQRST]           ET          LNQRS
The three types of distribution of correlated positions present in
lysozymes
The residue location and relative distribution is shown on tertiary structure of lysozyme
from rat (P00697, pdb5lyz)
The dispersed correlation
Position no. and        Correlation versus position 80
occurring residues      G (7)       H (31)      N (16)
80 [GHKNR]
30 [ILMV]               MV          ILMV        V
40 [DFKNR]              DN          N           FKNR
The observed number and contribution of three correlation types in four
different protein families
The correlation sets consist of 2 to over 20 residues
The correlation statistics
Protein family                             Total number    Number of    Number of    Number of    Number of sets
(number of correlated                      of correlation  dispersed    narrow       undirected   related to
positions/set)                             sets observed   sets         clusters     clusters     active center
Eglin-like proteins (2-13)                 20              7            7            6            1
Bowman-Birk proteinase inhibitors (2-28)   23              4            13           6            9
Myoglobins (2-29)                          41              23           9            9            n.a.
Lysozymes (2-15)                           41              25           9            7            9
All families                               125 (100%)      59 (47.2%)   38 (30.4%)   28 (22.4%)   -
A mathematician – biologist dialogue
The communication problem
Bowls are
convex
Bowls are
concave
In the entire splendour of natural phenomena...
...the first conclusion is not always correct,
nor the first impression consistent with reality
         3     10        20        30        40        50        60
P01055   ESSKPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CKP
P01057   ESSKPCCDECACTKSIPPQCRCTDVRLNSCHSACSSCVCTFSIPAQCV-CVDMKDFCYAP-CKS
P01056   QSSKPCCBHCACTKSIPPQCRCTDLRLDSCHSACKSCICTLSIPAQCV-CBBIBDFCYEP-CKS
P01058   ESSKPCCDQCSCTKSMPPKCRCSDIRLNSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
P01059   ESSKPCCDLCTCTKSIPPQCHCNDMRLNSCHSACKSCICALSEPAQCF-CVDTTDFCYKS-CHN
P01063   ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
P17734   QSSKPCCRQCACTKSIPPQCRCSQVRLNSCHSACKSCACTFSIPAQCF-CGBIBBFCYKP-CKS
P81483   -SSKPCCBHCACTKSIPPQCRCSBLRLNSCHSECKGCICTFSIPAQCI-CTDTNNFCYEP-CKS
P81484   -SSKPCCBHCACTKSIPPQCRCSBLRLNSCHSECKGCICTFSIPAQCI-CTDTNNFCYEP-CKS
P16343   ESSKPCCSSC-CTRSRPPQCQCTDVRLNSCHSACKSCMCTFSDPGMCS-CLDVTDFCYKP-CKS
P01064   EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
P82469   -SSGPCCDRCRCTKSEPPQCQCQDVRLNSCHSACEACVCSHSMPGLCS-CLDITHFCHEP-CKS
P01061   ESSHPCCDLCLCTKSIPPQCQCADIRLDSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
P01062   ESSEPCCDSCDCTKSIPPECHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
P01060   QSSPPCCBICVCTASIPPQCVCTBIRLBSCHSACKSCMCTRSMPGKCR-CLBTTBYCYKS-CKS
1BBI:    ESSKPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CKP
1D6R:I   ---KPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CK-
1DF9:C   ESSEPCCDSCDCTKSIPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
1PI2:    EYSKPCCDLCMCTRSMPPQCSCED-RINSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
1PBI:A   DVKSACCDTCLCTKSNPPTCRCVDVGET-CHSACLSCICAYSNPPKCQ-CFDTQKFCYKQ-CHN
AAB4719  ESSKPCCDQCTCTKSIPPQCRCTDVRLNSCHSACSSCVCTFSIPAQCV-CVDMKDFCYAP-CKS
TISYC2   ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
JC2225   ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
TIZB2    ESSKPCCDQC-CTKSMPPKCRCSDIRLDSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
JC2073   ESSKPCCDECKCTKSEPPQCQCVDTRLESCHSACKLCLCALSFPAKCR-CVDTTDFCYKP-CKS
JC2072   ESSKPCCDECKCTKSEPPQCQCVDTRLESCHSACKLCLCALSFPAKCR-CVDTTDFCYKP-CKS
0506164  ESSKPCCDQC-CTKSMPPKCRCSDIRLDSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
0401177  ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
763679A  ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
TISYD2   EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
0907248  ESSEPCCDSCRCTKSIPPQCHCADIRLNSCHSACKSCMCTRSMPGKCR-CLDTDDFCYKP-CES
1102213  ESSEPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCH-CLDTHDFCHKP-CKS
1102213  ESSEPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
0404180  EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
TIZB1B   ESSHPCCDLCLCTKSIPPQCQCADIRLDSCHSACKSCMCTRSMPGQCH-CLDTHDFCHKP-CKS
TIMB     ESSEPCCDSCDCTKSKPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
TIZB1P   ESSHPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
JC1066   ESSEPCCDSCDCTKSKPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCTKP-CES
Q41066   DVKSACCDTCLCTKSDPPTCRCVDVGET-CHSACDSCICALSYPPQCQ-CFDTHKFCYKA-CHN
P80321   STTTACCDFCPCTRSIPPQCQCTDVREK-CHSACKSCLCTLSIPPQCH-CYDITDFCYPS-CR-
Q41065   DVKSACCDTCLCTKSNPPTCRCVDVRET-CHSACDSCICAYSNPPKCQ-CFDTHKFCYKA-CHN
P81705   --TSACCDKCFCTKSNPPICQCRDVGET-CHSACKFCICALSYPAQCH-CLDQNTFCYDK-CDS
P56679   DVKSACCDTCLCTKSNPPTCRCVDVGET-CHSACLSCICAYSNPPKCQ-CFDTQKFCYKA-CHN
P16346   --TTACCNFCPCTRSIPPQCRCTDIGET-CHSACKTCLCTKSIPPQCH-CADITNFCYPK-CN-
P01065   DVKSACCDTCLCTRSQPPTCRCVDVGER-CHSACNHCVCNYSNPPQCQ-CFDTHKFCYKA-CHS
P24661   DVKSACCDTCLCTKSEPPTCRCVDVGER-CHSACNSCVCRYSNPPKCQ-CFDTHKFCYKS-CHN
P07679   KRPWECCDIAMCTRSIPPICRCVDKVDR-CSDACKDCEETEDN--RHV-CFDTYIGDPGPTCHD
P19860   ERPWKCCDLQTCTKSIPAFCRCRDLLEQ-CSDACKECGKVRDSDPPRYICQDVYRGIPAPMCHE
P22737   ERPWKCCDLQTCTKSIPAFCRCRDLLEQ-CSDACKECGKVRDSDPPRYICQDVYRGIPAPMCHE
220645   ES-EGCCDRCICTKSMPPQCHCHDVRLDSCHSDCETCICTRSYPAQCR-CADTTDFCYKP-C-S
P09864   TRPWKCCDRAICTKSFPPMCRCMDMVEQ-CAATCKKCGPATSDSSRRV-CEDXY-----------
P09863   KRPWKCCDQAVCTRSIPPICRCMDQVFE-CPSTCKACGPSVGDPSRRV-CQDQYV----------
Thank you
for your attention
!!!