Nuclear receptor ligand-binding domains,
looked at from all directions.
Nuclear receptor function
Nuclear receptor family
NR2A2-HN4G
NR2B3-RRXG
NR2A5-HN4 d?
NR2B1-RRXA
NR2B2-RRXB
NR3C1-GCR
NR2A1-HNF4
NR3C4-ANDR
NR2C2-TR4
NR2C1-TR2-11
NR2E1-TLX
NR0B1-DAX1
NR0B2-SHP
NR2E3-PNR
NR3A1-ESTR
NR3C2-MCR
NR3A2-ERBT
NR3B1-ERR1
NR6A1-GCNF
NR2F6-EAR2
NR3B2-ERR2
NR5A1-SF1
NR5A2-FTF
NR2F2-ARP1
NR2F1-COTF
NR3C3-PRGR
NR4A1-NGFI
NR4A3-NOR1
NR1C1-PPAR
NR4A2-NOT
NR1C2-PPAS
NR1H4-FAR
NR1C3-PPAT
NR1H3-LXR
NR1D1-EAR1
NR1D2-BD73
NR1I1-VDR
NR1F3-RORG
NR1A2-THB1
NR1F1-ROR1
NR1I2-PXR
NR1A1-THA1
NR1F2-RORB
NR1B3-RRG1
NR1B2-RRB2
NR1B1-RRA1
NR1I4-CAR1-MOUSE-
NR1H2-NER
NR1I3-MB67
Nuclear receptor structure
Domains, from N- to C-terminus: A-B (AF-1), C (DNA-binding domain), D, E (ligand-binding domain, LBD), F
C: DNA binding domain
– highly conserved
– > 90% similarity
E: Ligand binding domain
– conserved protein fold
– > 20% sequence similarity
The questions
As Organon is paying the bills, question one is,
of course☺, how do ligands relate to activity?
NRs can bind co-activators and co-repressors, with or
without ligand being present, so what are agonists,
antagonists, and inverse agonists?
What is the role of each amino acid in the NR LBD?
Which data handling is needed to answer these questions?
3D structure LBD
(hER)
Available NR data
56 structures (in the PDB)
>500 sequences (scattered)
>1000 mutations (very scattered)
>10000 ligand-binding studies (secret)
Disease patterns, expression, >1000 SNPs, genetic
localization, etc., etc., etc.
This data must be integrated, sorted, combined,
validated, understood, and used to answer our
questions.
Step 1
The first important step is a common numbering
scheme.
Whoever solves that problem once and for all should
get three Nobel prizes.
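One way to picture a common numbering scheme, as a minimal sketch only: number each residue by the column it occupies in a multiple sequence alignment, so that equivalent positions in different receptors share one number. The function name, the toy fragments, and the starting residue numbers are all hypothetical; the slides do not say how the actual scheme was built.

```python
def native_to_common(aligned_seq, first_residue_number=1, gap="-"):
    """Map each native residue number to its 1-based alignment column.

    Assumption (not from the slides): the common number of a residue is
    simply the alignment column it occupies.
    """
    mapping = {}
    native = first_residue_number
    for column, symbol in enumerate(aligned_seq, start=1):
        if symbol == gap:
            continue  # a gap uses up a column but no residue
        mapping[native] = column
        native += 1
    return mapping


# Two made-up aligned fragments; receptor B has a gap in column 3.
receptor_a = native_to_common("ASDFGH", first_residue_number=301)
receptor_b = native_to_common("AS-FGH", first_residue_number=120)
```

In this toy case residue 304 of receptor A and residue 122 of receptor B both map to common position 4, which is the kind of equivalence a shared numbering is meant to expose.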
Large data volumes
Large data volumes allow us to develop new data
analysis techniques.
Entropy-variability analysis is a novel technique to look
at very large multiple sequence alignments.
Entropy-variability analysis requires ‘better’ alignments
than are routinely obtained with ‘standard’ multiple
sequence alignment programs.
Structure-based alignment
Entropy
Sequence entropy E_i at position i is calculated
from the frequencies p_{i,a} of the twenty amino acid
types a at position i:

$$E_i = -\sum_{a=1}^{20} p_{i,a} \ln p_{i,a}$$

Example:
12345678
ASDFGHKL
ASEFNHKL
ASDYGHRL
ASDFSHKL
ASEYDHHI
ATEYPHKL
Entropy at 1 is zero because 0*ln(0)=0 (by convention) and 1*ln(1)=0
Entropy at 2 is -(.83*ln(.83) + .17*ln(.17)) ~ .45
Entropy at 3 is -2*.5*ln(.5) ~ .69
Entropy at 5 is -(.33*ln(.33) + 4*.17*ln(.17)) ~ 1.56
The maximum, with all twenty types equally frequent, is -20*.05*ln(.05) ~ 3.0
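The entropy calculation can be checked with a short sketch, using the natural logarithm as in the example and the six toy sequences from the slide. The function names are illustrative, and the sketch assumes an alignment without gap characters; how gaps were treated in the actual analysis is not stated.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (natural log) of the residue types in one column."""
    counts = Counter(column)
    total = len(column)
    # Types that do not occur contribute nothing, matching 0*ln(0) = 0.
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def alignment_entropies(sequences):
    """Entropy of every column of an alignment of equal-length sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


alignment = [
    "ASDFGHKL",
    "ASEFNHKL",
    "ASDYGHRL",
    "ASDFSHKL",
    "ASEYDHHI",
    "ATEYPHKL",
]

entropies = alignment_entropies(alignment)
```

For this alignment the values at positions 1, 2, 3 and 5 come out as roughly 0, 0.45, 0.69 and 1.56.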
Variability
Sequence variability Vi is the number of
amino acid types observed at position i in
more than 0.5% of all sequences.
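As a minimal sketch, variability can be computed by counting the residue types whose frequency in a column exceeds the 0.5% cut-off. The function and parameter names are illustrative, and taking the column length as "all sequences" (with no special gap handling) is an assumption.

```python
from collections import Counter


def column_variability(column, min_fraction=0.005):
    """Number of residue types seen in more than min_fraction of the sequences."""
    counts = Counter(column)
    total = len(column)  # assumption: every sequence contributes to the total
    return sum(1 for n in counts.values() if n / total > min_fraction)


# Made-up column from 1000 sequences: T occurs in only 0.4% of them.
rare_t = "A" * 600 + "S" * 396 + "T" * 4
# Same column with T in 0.6% of the sequences.
less_rare_t = "A" * 600 + "S" * 394 + "T" * 6
```

Here `column_variability(rare_t)` is 2 and `column_variability(less_rare_t)` is 3: the cut-off keeps a handful of outlier (or misaligned) sequences from inflating the variability of a large alignment.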
Rules
1) If a residue is conserved,
it is important
2) If a residue is very conserved,
it is very important
And with 1000 sequences:
Ras Entropy-Variability
Protease Entropy-Variability
Globin Entropy-Variability
Colour key for the boxes in all three figures:
11 Red
12 Orange
22 Yellow
23 Green
33 Blue
GPCR Entropy-Variability
11 G protein
12 Support
22 Signaling
23 Ligand in
33 Ligand out
NR LBD Entropy-Variability
11 main function
12 first shell around main function
22 core residues (signal transduction)
23 modulator
33 mainly surface
[Plot: entropy (0.0 to 2.8) against variability (0 to 18), with boxes 11, 12, 22, 23 and 33 marked.]
Mutation data
1095 entries
41 receptors
12 species
3D numbers
7 sources
http://www.cmbi.kun.nl/NR
and click on NRMD
Mutation data
[Bar charts, one per panel: Transcription, Diseases, Coregulator, Dimerization. Vertical axis: percentage; horizontal axis: Box 11, Box 12, Box 22, Box 23, Box 33.]
Mutation data
[Bar charts, one per panel: No effect, Ligand binding, No mutations. Vertical axis: percentage; horizontal axis: Box 11, Box 12, Box 22, Box 23, Box 33.]
Ligand binding data
Ligand-binding positions extracted from PDB files (nomenclature)
Categorized from very frequent to infrequent binders
Type of ligand bound (agonist/antagonist=inverse agonist…)
Ligand-binding residues
LIG 1 more than 50 of 56
LIG 2 25-50 of 56
LIG 3 11-24 of 56
LIG 4 1-10 of 56
H-bonds (~35,15,15,15)
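The four LIG classes can be written as a small lookup, sketched here under the assumption that the counts are the number of the 56 PDB structures in which a residue position contacts the ligand. The function name is illustrative. Where the listed ranges touch, the sketch reads them literally: exactly 50 falls in LIG 2, since LIG 1 is "more than 50".

```python
N_STRUCTURES = 56  # number of PDB structures quoted in the slides


def ligand_contact_class(n_contacts):
    """Return the LIG class (1 to 4) for a position, or None if it never binds."""
    if not 0 <= n_contacts <= N_STRUCTURES:
        raise ValueError("count must be between 0 and %d" % N_STRUCTURES)
    if n_contacts > 50:
        return 1  # LIG 1: more than 50 of 56
    if n_contacts >= 25:
        return 2  # LIG 2: 25-50 of 56
    if n_contacts >= 11:
        return 3  # LIG 3: 11-24 of 56
    if n_contacts >= 1:
        return 4  # LIG 4: 1-10 of 56
    return None  # no ligand contact observed in any structure


classes = [ligand_contact_class(n) for n in (56, 50, 24, 10, 0)]
```

For the five counts in the last line this gives LIG 1, LIG 2, LIG 3, LIG 4 and no class, respectively.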
Example: role of Asp 351
agonist
antagonist
Ligand, cofactor and dimerization data
combined with entropy-variability analysis
[Bar charts, one per panel: Ligand contacting residues, Cofactor contacting residues, Residues involved in dimerization. Horizontal axis: Box 11, Box 12, Box 22, Box 23, Box 33.]
Conclusions:
Data is difficult, but we need it (sic); life would be so
nice if we could do without. PDB files are the worst.
Nomenclature is not homogeneous.
Much data has been carefully hidden in the literature,
where it can be retrieved only with great difficulty.
Residue numbering is difficult but very necessary.
Entropy-variability analysis is powerful, but requires
very 'good' alignments.
Acknowledgements:
Organon
Jacob de Vlieg
Jan Klomp
Paula van Noort
Scott Lusher
UCSF
Florence Horn
CMBI
Emmanuel Bettler
Simon Folkertsma
Henk-Jan Joosten
Joost van Durme
Wilco Fleuren
Jeroen Eitjes
Jeroen van Broekhuizen
Richard Notebaart
Richard van Hameren
Ralph Brandt