Wrong assumptions and misinterpretations in explanations of biological models, phenomena and processes
or
Is the biologist logical, and the computer scientist alive?

Jacek Leluk, ICM UW

How is it that your genome is 98% the same as the genome of a chimpanzee, but only 50% the same as your own father's genome?

"O składności członów człowieczych" ("On the Composition of Human Members")
Why do birds give no milk? Because they would have to have teats, which would hinder them in flying.
Andrzej z Kobylina (16th century)

Is biology "bilogical"?

Nomenclature chaos:
• Mitochondria or chondriosomes?
• Is papain a proteolytic enzyme?
• Definition of identity, similarity and homology

Misinterpretation:
• Amino acid sequence of a gene?
• Why are squash inhibitors inhibitors?
• Is wheat agglutinin meant to agglutinate rabbit red cells?

Incomplete knowledge:
• Stochastic index matrices
• Statistical description of biological processes

The problem of terminology
• BPTI
  - Basic Pancreatic Trypsin Inhibitor
  - Bovine Pancreatic Trypsin Inhibitor
  - Basic Protein Trypsin Inhibitor
• PAM
  - Point Accepted Mutations
  - Percent Accepted Mutations
• Kunitz trypsin inhibitor
  - BPTI: mammalian organs
  - STI: soybean trypsin inhibitor

What may everybody do wrong?
Monte Carlo approach in structure analysis and prediction
  - what state do we predict?
Mathematical modelling of life processes
  - Markov chains and protein evolution and differentiation
  - estimation of similarity significance

What may biologists do wrong?
Amino acids and proteins
  - do proteins consist of amino acids as we describe them?
Definitions and theory
  - definition of species and theory of evolution
  - definitions and biology
Correlated mutations
  - dispersed correlation

What may theoreticians do wrong?
Primitive or ancestral?
  - (Cyanophyta, Archaebacteria, ape and human)
Global and local energy minima
  - can we predict the exact conformation at an exact time?
Microscopic/mesoscopic/macroscopic processes
  - water molecule and tsunami
Assumptions and conclusions
  - incomplete assumptions and wrong conclusions
  - deformations by simplifying
  - is the protein sequence just a string of characters?

Sequence identity estimation in proteomics and genomics
Identity threshold: does it make sense?

WHAT IS IMPORTANT IN THE PROTEIN SIMILARITY SEARCH?

1) Contribution (%) of identical positions

   PKILMECKKD                 PKILMECKKD
   PKILMKCKHD                 SDCLLDCVCL
   8 identical (80%)          2 identical (20%)
   similar                    not similar

2) Length of the compared strings (sequences)

   LCE
   WCG
   1 identical (33.3%): casual

   MVEICIEPKIRCIKVCTKDERITCLILDET
   MVYWCPRRFMHCVHLKAGGCTCWCLRLDYY
   8 identical (26.7%): probably similar

3) Distribution of the identical positions along the analyzed sequence

   MVEMICIEPKIRCIKVCTKDERITL      MVEMIMAGDARCIKVCTKDERITCL
   HVYYWRPERFMHTVKLKAGGCRCWL      HHYYWMAGDAHTVQLKAGGCWCWAG
   5 identical (20%), scattered   5 identical (20%), contiguous
   casual                         similar

4) Residues at conservative positions

   MVCPKILMKCKHDSDCLLDCVCLED      MVCPKILMKCKHDSDTLLDCVCLED
   EDEGKRRTKREHFKESNLAAAFKEQ      QNCPGPREWCFTTRMNDSSCACPQT
   5 identical (K, K, H, L, E)    5 identical (C, P, C, C, C)
   not similar                    similar

5) Structural/genetic similarity of the amino acids at non-conservative positions

   MVCPKILMKCKHDSDCLLDCVCLED
   RLCRRLVKRCRKETECIVECICIDE

   The same pair is shown under three criteria: identity only, structural similarity, genetic similarity.
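Criteria 1, 3 and 4 can be checked mechanically for two strings that are already aligned and of equal length. The following is only a minimal sketch: the helper name `identity_profile` is hypothetical (it is not part of the original material), and the example strings are the ones used on the slides.

```python
def identity_profile(seq_a, seq_b):
    """Compare two pre-aligned, equal-length sequences position by position.

    Returns a tuple:
    (identical_count, percent_identity, longest_identical_run, identical_residues)
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")

    identical = 0      # criterion 1: number of identical positions
    longest_run = 0    # criterion 3: are the identities clustered or scattered?
    current_run = 0
    residues = []      # criterion 4: which residues are the identical ones?
    for a, b in zip(seq_a, seq_b):
        if a == b:
            identical += 1
            current_run += 1
            longest_run = max(longest_run, current_run)
            residues.append(a)
        else:
            current_run = 0

    percent = 100.0 * identical / len(seq_a)
    return identical, percent, longest_run, "".join(residues)


# Example pairs taken from the slides (labels are illustrative only).
pairs = {
    "short, 80%": ("PKILMECKKD", "PKILMKCKHD"),
    "scattered": ("MVEMICIEPKIRCIKVCTKDERITL", "HVYYWRPERFMHTVKLKAGGCRCWL"),
    "contiguous": ("MVEMIMAGDARCIKVCTKDERITCL", "HHYYWMAGDAHTVQLKAGGCWCWAG"),
    "Cys/Pro": ("MVCPKILMKCKHDSDTLLDCVCLED", "QNCPGPREWCFTTRMNDSSCACPQT"),
}
for label, (a, b) in pairs.items():
    print(label, identity_profile(a, b))
```

The "scattered" and "contiguous" pairs both give 20% identity, yet the longest identical run is 1 in the first and 5 in the second, so the percentage alone cannot separate a casual match from a similar one.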
Sequence multiple alignment
Problem of gap manipulation

Any protein can be aligned with any other as homologous/similar:

   anybiologicalstring       anyprotein
   anybilogicalstrip         canbealigned

   anybiologicalstri-ng      -an-yprote--i-n
   anybi-logicalstrip        canb-----ealigned

Statistical approaches vs. accuracy
How far may they be improved?

• Protein secondary structure prediction: accuracy 70-72% (not much changed since 1978).
• 100% accuracy requires a complete database of all possible structures.
• For polypeptides of 30 amino acids there are 20^30 sequences/secondary structures.
• Searching such a database for the appropriate sequence/structure at a rate of 10^12 sequences/sec would take 1.8 billion times longer than the age of the Universe.

Genetic conditioning of the amino acid replacement probabilities and spectrum in molecular evolution

The Markov model assumes that the probability of substituting amino acid AA1 by AA2 is the same regardless of which residue (AAx or AAy) AA1 itself arose from:

   AAx -> AA1 -> AA2   (probability Pa)
   AAy -> AA1 -> AA2   (probability Pb)
   Pa = Pb

The statistical algorithms currently in use are based on the Markovian model of amino acid replacement (they directly use stochastic matrices of replacement frequency indices).

BLOSUM62 matrix of amino acid replacements

   A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
A  4 -1 -2 -2  0 -1 -1  0 -2 -1 -1 -1 -1 -2 -1  1  0 -3 -2  0
R     5  0 -2 -3  1  0 -2  0 -3 -2  2 -1 -3 -2 -1 -1 -3 -2 -3
N        6  1 -3  0  0  0  1 -3 -3  0 -2 -3 -2  1  0 -4 -2 -3
D           6 -3  0  2 -1 -1 -3 -4 -1 -3 -3 -1  0 -1 -4 -3 -3
C              9 -3 -4 -3 -3 -1 -1 -3 -1 -2 -3 -1 -1 -2 -2 -1
Q                 5  2 -2  0 -3 -2  1  0 -3 -1  0 -1 -2 -1 -2
E                    5 -2  0 -3 -3  1 -2 -3 -1  0 -1 -3 -2 -2
G                       6 -2 -4 -4 -2 -3 -3 -2  0 -2 -2 -3 -3
H                          8 -3 -3 -1 -2 -1 -2 -1 -2 -2  2 -3
I                             4  2 -3  1  0 -3 -2 -1 -3 -1  3
L                                4 -2  2  0 -3 -2 -1 -2 -1  1
K                                   5 -1 -3 -1  0 -1 -3 -2 -2
M                                      5  0 -2 -1 -1 -1 -1  1
F                                         6 -4 -2 -2  1  3 -1
P                                            7 -1 -1 -4 -3 -2
S                                               4  1 -3 -2 -2
T                                                  5 -2 -2  0
W                                                    11  2 -3
Y                                                        7 -1
V                                                           4

Why is tryptophan (W) the most conservative residue here?

Replacement Arg -> Lys according to the statistical interpretation using stochastic matrix indices:

   PAM250      3
   BLOSUM62    2
   BLOSUM35    2
   BLOSUM45    3
   BLOSUM100   3

Arginine-to-lysine mutational replacements (R = A or G, Y = C or T):

   Met (ATG) -> Arg (AGG) -> Lys (AAG)
   Arg (CGR), with neighbours Leu (CTR) and Gln (CAR); Arg (AGR) and Lys (AAR)
   Arg (CGY), with neighbours His (CAY) and Ser (AGY)
   Arg (CGR) -> Arg (AGR) -> Lys (AAR)

Possible one-point-mutational processing of serine with respect to its origin:

   Trp (UGG) -> Ser (UCG) -> Thr, Ala, Pro, Trp, Leu, Ser, stop (UAG)
   Asn (AAU) -> Ser (AGU) -> Thr, Ile, Asn, Ser, Arg, Cys, Gly

Is arginine the same as arginine?
Possible codons for arginine: AGA AGG CGA CGG CGC CGT

[Figure: diagram of codon genetic relationships (all 64 codons with their amino acids)]
[Figure: diagram of amino acid genetic relationships]
[Figure: genetic relationships between Arg and Met/Gln]

What part of the codon contains the information about the previous amino acid that occurred at a certain position of the protein sequence?
At most 2/3 of the entire codon.

   Ala (GCG) -> Val (GUG)

How long is the information about the codons of preceding amino acids stored?
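One way to explore this question is to follow a codon through successive single-nucleotide substitutions and note when the last of its three original nucleotides has been replaced. The sketch below is only an illustration, assuming the standard genetic code; the helper names `one_point_products` and `storage_period` are hypothetical and were introduced for this example.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def _dna(codon):
    """Normalise a codon to upper-case DNA letters (U is treated as T)."""
    return codon.upper().replace("U", "T")


def one_point_products(codon):
    """Amino acids (one-letter codes) reachable by one nucleotide substitution."""
    codon = _dna(codon)
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                reachable.add(CODON_TABLE[codon[:pos] + base + codon[pos + 1:]])
    return reachable


def storage_period(chain):
    """Number of substitutions after which no nucleotide position of the
    first codon is left unsubstituted; None if some position never changes.

    Every consecutive pair in `chain` must differ at exactly one position.
    """
    codons = [_dna(c) for c in chain]
    untouched = {0, 1, 2}
    for step, (prev, curr) in enumerate(zip(codons, codons[1:]), start=1):
        changed = [i for i in range(3) if prev[i] != curr[i]]
        if len(changed) != 1:
            raise ValueError(f"{prev} -> {curr} is not a one-point mutation")
        untouched.discard(changed[0])
        if not untouched:
            return step
    return None


# Arginine encoded by AGG can become lysine in one step; arginine encoded by CGC cannot.
print("K" in one_point_products("AGG"), "K" in one_point_products("CGC"))
# Ala -> Val -> Met -> Ile
print(storage_period(["GCG", "GUG", "AUG", "AUA"]))
# Lys -> Asn -> Asp -> His -> Gln -> Glu -> Asp: the middle nucleotide never changes
print(storage_period(["AAA", "AAC", "GAC", "CAC", "CAG", "GAG", "GAU"]))
```

In this sketch a residue "remembers" its predecessor for as long as at least one nucleotide position of the original codon has not been substituted, which is also why two arginines with different codons do not have the same set of one-step successors.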
The shortest storage period is 3 transitions/transversions:

   Ala (GCG) -> Val (GUG) -> Met (AUG) -> Ile (AUA)
   Ser (UCC) -> Ser (UCU) -> Thr (ACU) -> Ser (AGU)

Theoretically, the longest period is infinite:

   Lys (AAA) -> Asn (AAC) -> Asp (GAC) -> His (CAC) -> Gln (CAG) -> Glu (GAG) -> Asp (GAU)
   Tyr (UAU) -> His (CAU) -> Asn (AAU) -> Lys (AAG) -> Gln (CAG) -> His (CAC) -> ...

Correlated mutations

The phenomenon of several mutations occurring simultaneously and dependent on each other.

According to the current hypothesis of positive Darwinian selection at the molecular level, correlated mutations are related to changes occurring in their neighborhood; they reflect protein-protein interactions and preserve the biological activity and structural properties of the molecule.

The current explanation of the occurrence of correlated mutations (example)

[Figure: side-chain structures of neighbouring residue pairs (Trp, Val, Leu, Ala)]

The three types of distribution of correlated positions present in myoglobins
The residue location and relative distribution are shown on the tertiary structure of human myoglobin (P0244, pdb1bzp).

The spot correlation cluster

   Position no. and          Correlation versus position 127
   occurring residues        A (58)      S (7)
   127 [AMSTV]
    27 [ADEFLNT]             ADEFNT      E
    31 [GKRS]                GKRS        R
    78 [AKLQ]                K           ALQ
   109 [DEGNT]               DEGT        E
   116 [AEHKQST]             AEHKQS      A
   117 [AEKNQS]              AEKQS       E
   122 [BDEN]                BDEN        D

The three types of distribution of correlated positions present in the Bowman-Birk inhibitor family
The residue location and relative distribution are shown on the tertiary structure of the Bowman-Birk inhibitor from soybean (P01055).

The narrow correlation cluster

   Position no. and          Correlation versus position 13
   occurring residues        L (11)      M (10)      A (8)
    13 [–ADFIKLMPRSTV]
     4 [–RSTVY]              V           –S          S
     5 [–KPST]               K           –S          S
     7 [AEGKP]               A           P           P
    11 [EFHIKLQRST]          T           EHQ         S
    21 [EFIKMQT]             T           Q           EQ

The three types of distribution of correlated positions present in eglin-like proteins
The residue location and relative distribution are shown on the tertiary structure of eglin C (P01051).

The dispersed correlation

   Position no. and          Correlation versus position 67
   occurring residues        D (8)       G (9)
    67 [–DGNT]
    10 [–ELNQRST]            ET          LNQRS

The three types of distribution of correlated positions present in lysozymes
The residue location and relative distribution are shown on the tertiary structure of lysozyme from rat (P00697, pdb5lyz).

The dispersed correlation

   Position no. and          Correlation versus position 80
   occurring residues        G (7)       H (31)      N (16)
    80 [GHKNR]
    30 [ILMV]                MV          ILMV        V
    40 [DFKNR]               DN          N           FKNR

The observed number and contribution of three correlation types in four different protein families
The correlation sets consist of 2 to over 20 residues.

The correlation statistics

   Protein family (number of correlated        Total sets    Dispersed    Narrow      Undirected   Sets related to
   positions per set)                          observed      sets         clusters    clusters     active center
   Eglin-like proteins (2-13)                  20            7            7           6            1
   Bowman-Birk proteinase inhibitors (2-28)    23            4            13          6            9
   Myoglobins (2-29)                           41            23           9           9            n.a.
   Lysozymes (2-15)                            41            25           9           7            9
   All families                                125 (100%)    59 (47.2%)   38 (30.4%)  28 (22.4%)   -

A mathematician-biologist dialogue
The communication problem
   - Bowls are convex
   - Bowls are concave

In the entire splendour of natural phenomena...
...the first conclusion is not always correct, nor the first impression consistent with reality.

            3     10        20        30        40        50        60
P01055    ESSKPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CKP
P01057    ESSKPCCDECACTKSIPPQCRCTDVRLNSCHSACSSCVCTFSIPAQCV-CVDMKDFCYAP-CKS
P01056    QSSKPCCBHCACTKSIPPQCRCTDLRLDSCHSACKSCICTLSIPAQCV-CBBIBDFCYEP-CKS
P01058    ESSKPCCDQCSCTKSMPPKCRCSDIRLNSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
P01059    ESSKPCCDLCTCTKSIPPQCHCNDMRLNSCHSACKSCICALSEPAQCF-CVDTTDFCYKS-CHN
P01063    ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
P17734    QSSKPCCRQCACTKSIPPQCRCSQVRLNSCHSACKSCACTFSIPAQCF-CGBIBBFCYKP-CKS
P81483    -SSKPCCBHCACTKSIPPQCRCSBLRLNSCHSECKGCICTFSIPAQCI-CTDTNNFCYEP-CKS
P81484    -SSKPCCBHCACTKSIPPQCRCSBLRLNSCHSECKGCICTFSIPAQCI-CTDTNNFCYEP-CKS
P16343    ESSKPCCSSC-CTRSRPPQCQCTDVRLNSCHSACKSCMCTFSDPGMCS-CLDVTDFCYKP-CKS
P01064    EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
P82469    -SSGPCCDRCRCTKSEPPQCQCQDVRLNSCHSACEACVCSHSMPGLCS-CLDITHFCHEP-CKS
P01061    ESSHPCCDLCLCTKSIPPQCQCADIRLDSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
P01062    ESSEPCCDSCDCTKSIPPECHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
P01060    QSSPPCCBICVCTASIPPQCVCTBIRLBSCHSACKSCMCTRSMPGKCR-CLBTTBYCYKS-CKS
1BBI:     ESSKPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CKP
1D6R:I    ---KPCCDQCACTKSNPPQCRCSDMRLNSCHSACKSCICALSYPAQCF-CVDITDFCYEP-CK-
1DF9:C    ESSEPCCDSCDCTKSIPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
1PI2:     EYSKPCCDLCMCTRSMPPQCSCED-RINSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
1PBI:A    DVKSACCDTCLCTKSNPPTCRCVDVGET-CHSACLSCICAYSNPPKCQ-CFDTQKFCYKQ-CHN
AAB4719   ESSKPCCDQCTCTKSIPPQCRCTDVRLNSCHSACSSCVCTFSIPAQCV-CVDMKDFCYAP-CKS
TISYC2    ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
JC2225    ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
TIZB2     ESSKPCCDQC-CTKSMPPKCRCSDIRLDSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
JC2073    ESSKPCCDECKCTKSEPPQCQCVDTRLESCHSACKLCLCALSFPAKCR-CVDTTDFCYKP-CKS
JC2072    ESSKPCCDECKCTKSEPPQCQCVDTRLESCHSACKLCLCALSFPAKCR-CVDTTDFCYKP-CKS
0506164   ESSKPCCDQC-CTKSMPPKCRCSDIRLDSCHSACKSCACTYSIPAKCF-CTDINDFCYEP-CKS
0401177   ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
763679A   ESSKPCCDLCMCTASMPPQCHCADIRLNSCHSACDRCACTRSMPGQCR-CLDTTDFCYKP-CKS
TISYD2    EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
0907248   ESSEPCCDSCRCTKSIPPQCHCADIRLNSCHSACKSCMCTRSMPGKCR-CLDTDDFCYKP-CES
1102213   ESSEPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCH-CLDTHDFCHKP-CKS
1102213   ESSEPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
0404180   EYSKPCCDLCMCTRSMPPQCSCEDIRLNSCHSDCKSCMCTRSQPGQCR-CLDTNDFCYKP-CKS
TIZB1B    ESSHPCCDLCLCTKSIPPQCQCADIRLDSCHSACKSCMCTRSMPGQCH-CLDTHDFCHKP-CKS
TIMB      ESSEPCCDSCDCTKSKPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCYKP-CES
TIZB1P    ESSHPCCDLCLCTKSIPPQCQCADIRLNSCHSACKSCMCTRSMPGQCR-CLDTHDFCHKP-CKS
JC1066    ESSEPCCDSCDCTKSKPPQCHCANIRLNSCHSACKSCICTRSMPGKCR-CLDTDDFCTKP-CES
Q41066    DVKSACCDTCLCTKSDPPTCRCVDVGET-CHSACDSCICALSYPPQCQ-CFDTHKFCYKA-CHN
P80321    STTTACCDFCPCTRSIPPQCQCTDVREK-CHSACKSCLCTLSIPPQCH-CYDITDFCYPS-CR-
Q41065    DVKSACCDTCLCTKSNPPTCRCVDVRET-CHSACDSCICAYSNPPKCQ-CFDTHKFCYKA-CHN
P81705    --TSACCDKCFCTKSNPPICQCRDVGET-CHSACKFCICALSYPAQCH-CLDQNTFCYDK-CDS
P56679    DVKSACCDTCLCTKSNPPTCRCVDVGET-CHSACLSCICAYSNPPKCQ-CFDTQKFCYKA-CHN
P16346    --TTACCNFCPCTRSIPPQCRCTDIGET-CHSACKTCLCTKSIPPQCH-CADITNFCYPK-CN-
P01065    DVKSACCDTCLCTRSQPPTCRCVDVGER-CHSACNHCVCNYSNPPQCQ-CFDTHKFCYKA-CHS
P24661    DVKSACCDTCLCTKSEPPTCRCVDVGER-CHSACNSCVCRYSNPPKCQ-CFDTHKFCYKS-CHN
P07679    KRPWECCDIAMCTRSIPPICRCVDKVDR-CSDACKDCEETEDN--RHV-CFDTYIGDPGPTCHD
P19860    ERPWKCCDLQTCTKSIPAFCRCRDLLEQ-CSDACKECGKVRDSDPPRYICQDVYRGIPAPMCHE
P22737    ERPWKCCDLQTCTKSIPAFCRCRDLLEQ-CSDACKECGKVRDSDPPRYICQDVYRGIPAPMCHE
220645    ES-EGCCDRCICTKSMPPQCHCHDVRLDSCHSDCETCICTRSYPAQCR-CADTTDFCYKP-C-S
P09864    TRPWKCCDRAICTKSFPPMCRCMDMVEQ-CAATCKKCGPATSDSSRRV-CEDXY----------
P09863    KRPWKCCDQAVCTRSIPPICRCMDQVFE-CPSTCKACGPSVGDPSRRV-CQDQYV----------

Thank you for your attention!!!