The Model of Two Overlapping Circles.
The purpose of this model is to estimate the distribution of the areas of contact between
residues at protein-protein interfaces.
Consider first the simple task of estimating the intersection area of two overlapping circles. We assume that both circles have the same radius r.
Figure 1. AOBs is a sector of the circle, s is the arc of this sector, and AB is its chord (span).
The area of each circle is πr², so the intersection area lies within [0, πr²]. Let us find how the intersection area S depends on the distance L between the centers of the circles. Obviously, if L ≥ 2r, then S = 0. Otherwise, the intersection consists of two identical circular segments, each bounded by the arc s of the sector and the chord AB (Fig. 1). The area of one segment is the area of sector AOBs minus the area of triangle AOB. The intersection area of the circles is therefore:
$$S(L) = 2\left( r^2 \arcsin\sqrt{1 - \frac{L^2}{4r^2}} - \frac{L}{2}\sqrt{r^2 - \frac{L^2}{4}} \right), \tag{1}$$
where L is the distance between the centers of the circles. The formula holds for L ∈ [0, 2r].
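Formula (1) is straightforward to evaluate numerically. The following is a minimal Python sketch, not part of the original text; the function name `intersection_area` and the sample radius are illustrative assumptions. It returns 0 for L ≥ 2r and reproduces the limiting value πr² at L = 0.

```python
import math

def intersection_area(L: float, r: float) -> float:
    """Intersection area S(L) of two circles of radius r with centers L apart, formula (1)."""
    if L < 0:
        raise ValueError("distance L must be non-negative")
    if L >= 2 * r:
        return 0.0  # the circles do not overlap
    sector_term = r**2 * math.asin(math.sqrt(1 - L**2 / (4 * r**2)))
    triangle_term = (L / 2) * math.sqrt(r**2 - L**2 / 4)
    return 2 * (sector_term - triangle_term)

r = 1.0  # illustrative radius (assumption)
for L in (0.0, 0.5, 1.0, 1.5, 2.0):
    print(f"L = {L:.1f}  S = {intersection_area(L, r):.4f}")
```

The area decreases monotonically from πr² to 0 as L goes from 0 to 2r.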
Casual contacts. For casual contacts, the problem of contact formation can be reduced to the problem of two identical circles of radius r thrown at random into a square region with side length R.
Let the coordinates of the centers of the circles be (x1, y1) and (x2, y2), respectively. The distance between the centers is then
$$L = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}.$$
If we place a corner of the square region at the origin (Fig. 2), the coordinates of the centers can take values from r to R−r.
Figure 2. The square region into which the circles are thrown (both axes run from 0 to R). Its central square zone, from r to R−r along x and y, is the zone of possible coordinates of the circles' centers.
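The random throwing described above can be simulated directly. The sketch below assumes that each center coordinate is drawn independently and uniformly from [r, R−r]; the text says only that the circles intersect by chance, so uniformity is an assumption, as are the function name and the sample values of R and r.

```python
import math
import random

def sample_center_distance(R: float, r: float, rng: random.Random) -> float:
    """Throw two circles of radius r into an R x R square; return the distance L between centers."""
    if R <= 2 * r:
        raise ValueError("the square must be larger than one circle (R > 2r)")
    # Each center coordinate is uniform on [r, R - r], so both circles stay inside the square.
    x1, y1 = rng.uniform(r, R - r), rng.uniform(r, R - r)
    x2, y2 = rng.uniform(r, R - r), rng.uniform(r, R - r)
    return math.hypot(x1 - x2, y1 - y2)

rng = random.Random(0)   # fixed seed, only for reproducibility
R, r = 20.0, 1.0         # illustrative sizes (assumption)
distances = [sample_center_distance(R, r, rng) for _ in range(10_000)]
print(min(distances), max(distances))
```

No distance can exceed the diagonal of the central zone, √2·(R−2r).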
Let us find the probability that the intersection area S is zero. In mathematical terms, the distance between the centers of the circles must then be greater than or equal to the sum of their radii:
$$L \ge 2r. \tag{2}$$
This probability is the ratio of the volume of the part of coordinate space that satisfies condition (2) to the total volume of the space available to the coordinates of the centers. Since each of the four coordinates x1, y1, x2, y2 ranges over an interval of length R−2r, the total volume is (R−2r)⁴.
Let us make the following substitution:
$$u_1 = \frac{x_1 - x_2}{2}; \quad u_2 = \frac{x_1 + x_2}{2}; \quad v_1 = \frac{y_1 - y_2}{2}; \quad v_2 = \frac{y_1 + y_2}{2}.$$
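As a quick check of the substitution, the following Python sketch applies it to one illustrative point and then inverts it. The helper names `to_uv` and `to_xy` and the sample coordinates are assumptions made for illustration only.

```python
def to_uv(x1, x2, y1, y2):
    """Apply the substitution: half-differences (u1, v1) and half-sums (u2, v2)."""
    return ((x1 - x2) / 2, (x1 + x2) / 2, (y1 - y2) / 2, (y1 + y2) / 2)

def to_xy(u1, u2, v1, v2):
    """Invert the substitution and recover (x1, x2, y1, y2)."""
    return (u2 + u1, u2 - u1, v2 + v1, v2 - v1)

point = (3.0, 7.5, 4.0, 1.0)   # illustrative (x1, x2, y1, y2)
uv = to_uv(*point)
print(uv)                      # (u1, u2, v1, v2)
print(to_xy(*uv))              # the original point again
```

The map is linear and invertible, so every pair of centers corresponds to exactly one point (u1, u2, v1, v2).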
Then the transition matrix is:
$$\begin{pmatrix} u_1 \\ u_2 \\ v_1 \\ v_2 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & -1 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ y_1 \\ y_2 \end{pmatrix}$$
Inequality (2) can now be rewritten as:
$$u_1^2 + v_1^2 \ge (2r)^2, \tag{3}$$
where u and v can take values from the interval $\left[-\frac{R - 2r}{2};\ \frac{R - 2r}{2}\right]$ (Fig. 3).
Figure 3. The rhomb area in the (u, v) plane is the zone of possible coordinates of the circles' centers. The central circle zone corresponds to inequality (3).
It is easy to see that inequality (3) corresponds to the space outside the circle of radius 2r and inside the square with side length R−2r, so the probability P of zero intersection area of two identical circles is:
$$P = \frac{\left((R - 2r)^2 - \pi (2r)^2\right)(R - 2r)^2}{(R - 2r)^4} = \frac{(R - 2r)^2 - \pi (2r)^2}{(R - 2r)^2} = 1 - \frac{\pi (2r)^2}{(R - 2r)^2}. \tag{4}$$
The formula
$$P = \frac{\pi L^2}{(R - 2r)^2} \tag{5}$$
gives the probability that the distance between the centers of the circles does not exceed L, for L ∈ [0, 2r]. At L = 2r it is the probability that the intersection area is greater than zero (see (2)).
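Formulas (4) and (5) can be compared with a direct simulation of the throwing experiment. The sketch below is an illustration under assumptions: centers are drawn independently and uniformly from [r, R−r], and the function names, seed, and sizes are hypothetical. The closed form and the simulated fraction should be close but need not coincide exactly, and the agreement is expected to improve as R grows relative to r.

```python
import math
import random

def p_within(L: float, R: float, r: float) -> float:
    """Formula (5): probability that the center distance does not exceed L."""
    return math.pi * L**2 / (R - 2 * r) ** 2

def p_zero_overlap(R: float, r: float) -> float:
    """Formula (4): probability that the two circles do not intersect."""
    return 1.0 - p_within(2 * r, R, r)

def simulate_p_zero_overlap(R: float, r: float, n: int, seed: int = 0) -> float:
    """Monte Carlo estimate: fraction of random throws with L >= 2r."""
    rng = random.Random(seed)
    lo, hi = r, R - r
    misses = 0
    for _ in range(n):
        dx = rng.uniform(lo, hi) - rng.uniform(lo, hi)
        dy = rng.uniform(lo, hi) - rng.uniform(lo, hi)
        if math.hypot(dx, dy) >= 2 * r:
            misses += 1
    return misses / n

R, r = 20.0, 1.0  # illustrative sizes (assumption)
print("formula (4):", p_zero_overlap(R, r))
print("simulation: ", simulate_p_zero_overlap(R, r, n=100_000))
```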




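To find the dependence of the probability density dP/dS on the intersection area S, we rewrite the corresponding equation in parametric form, because L cannot be expressed explicitly as a function of S:
$$\frac{dS}{dL} = 2\left( r^2 \cdot \frac{1}{\sqrt{1 - \left(1 - \frac{L^2}{4r^2}\right)}} \cdot \frac{1}{2\sqrt{1 - \frac{L^2}{4r^2}}} \cdot \left(-\frac{2L}{4r^2}\right) - \left( \frac{1}{2}\sqrt{r^2 - \frac{L^2}{4}} + \frac{L}{2} \cdot \frac{1}{2\sqrt{r^2 - \frac{L^2}{4}}} \cdot \left(-\frac{2L}{4}\right) \right) \right) = -2\sqrt{r^2 - \frac{L^2}{4}},$$
$$\frac{dP}{dL} = \frac{2\pi L}{(R - 2r)^2},$$
$$\begin{cases} \dfrac{dP}{dS} = \dfrac{dP}{dL}\,\dfrac{dL}{dS} = \left( \dfrac{2\pi L}{(R - 2r)^2} \right)\left( -\dfrac{1}{2\sqrt{r^2 - \frac{L^2}{4}}} \right) = -\dfrac{\pi L}{r (R - 2r)^2 \sqrt{1 - \frac{L^2}{4r^2}}} \\[2ex] S(L) = 2\left( r^2 \arcsin\sqrt{1 - \dfrac{L^2}{4r^2}} - \dfrac{Lr}{2}\sqrt{1 - \dfrac{L^2}{4r^2}} \right) \end{cases} \tag{6}$$
The minus sign reflects that S decreases as L increases; the density of contact areas is the absolute value of this expression.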
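System (6) can be traced numerically by sweeping the parameter L over (0, 2r). The Python sketch below is illustrative only: the function names and the values of R and r are assumptions. It also checks the closed-form derivative dS/dL = −2√(r² − L²/4) against a central finite difference.

```python
import math

def area(L, r):
    """S(L) from system (6)."""
    root = math.sqrt(1 - L**2 / (4 * r**2))
    return 2 * (r**2 * math.asin(root) - (L * r / 2) * root)

def dS_dL(L, r):
    """Closed-form derivative dS/dL = -2 * sqrt(r^2 - L^2 / 4)."""
    return -2 * math.sqrt(r**2 - L**2 / 4)

def casual_density(L, R, r):
    """Magnitude of dP/dS from system (6); it grows without bound as L approaches 2r (S -> 0)."""
    return math.pi * L / (r * (R - 2 * r) ** 2 * math.sqrt(1 - L**2 / (4 * r**2)))

R, r = 20.0, 1.0   # illustrative sizes (assumption)
h = 1e-6           # step for the finite-difference check
for L in (0.2, 0.8, 1.4, 1.9):
    numeric = (area(L + h, r) - area(L - h, r)) / (2 * h)
    print(f"L={L:.1f}  S={area(L, r):.4f}  |dP/dS|={casual_density(L, R, r):.5f}  "
          f"dS/dL closed form={dS_dL(L, r):.5f}  numeric={numeric:.5f}")
```

Plotting `casual_density` against `area` for the same values of L gives the parametric curve: small areas (L close to 2r) carry the largest density.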
A plot of dependence (6) is shown in Figure 4 (curve A). As the contact area increases, the number of contacts decreases rapidly. Consequently, the average of the distribution is close to zero. In other words, the distribution for casual contacts contains many small-area contacts (Fig. 4, curve A).
Specific contacts. It is reasonable to assume that specific contacts have a well-defined nonzero contact area, so that some average specific contact area exists, originating from specific (physicochemical) interactions of amino acid residues. Let us treat such specific interactions as a tendency to form the maximal contact area. In this case, the centers of the circles tend to coincide, and the problem is statistically equivalent to shooting at a target. The distribution of distances between the hit points and the center of the target is then taken to be normal. Thus, in the model of specific contacts, the distance between the centers of the circles follows the normal distribution, again written in parametric form:
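$$\begin{cases} f(L) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(L - a)^2}{2\sigma^2}} \\[2ex] S(L) = 2\left( r^2 \arcsin\sqrt{1 - \dfrac{L^2}{4r^2}} - \dfrac{Lr}{2}\sqrt{1 - \dfrac{L^2}{4r^2}} \right) \end{cases} \tag{7}$$
where a and σ are the mean and the standard deviation of the normal distribution.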
In the general case, the curves given by (6) and (7) are combined into a weighted sum, with coefficients that reflect the ratio of casual and specific contacts (Fig. 4).
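System (7) and the weighted sum can be sketched in the same parametric way, since both curves share the parameter L and the same S(L). In the Python sketch below, `a` and `sigma` denote the mean and standard deviation of the normal distribution, and every numeric value, including the coefficients `c_casual` and `c_specific`, is a hypothetical placeholder rather than a value from the text. No attempt is made to normalize the sum so that the area under it equals 1.

```python
import math

def area(L, r):
    """S(L), shared by systems (6) and (7); valid for 0 <= L <= 2r."""
    root = math.sqrt(max(0.0, 1 - L**2 / (4 * r**2)))
    return 2 * (r**2 * math.asin(root) - (L * r / 2) * root)

def casual_density(L, R, r):
    """Magnitude of dP/dS from system (6)."""
    return math.pi * L / (r * (R - 2 * r) ** 2 * math.sqrt(1 - L**2 / (4 * r**2)))

def specific_density(L, a, sigma):
    """f(L) from system (7): normal density with mean a and standard deviation sigma."""
    return math.exp(-((L - a) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def composite_curve(R, r, a, sigma, c_casual, c_specific, n=50):
    """Return (S, value) pairs of the weighted sum, sweeping L over the open interval (0, 2r)."""
    points = []
    for i in range(1, n):  # skip L = 0 and L = 2r, where (6) is zero or unbounded
        L = 2 * r * i / n
        value = c_casual * casual_density(L, R, r) + c_specific * specific_density(L, a, sigma)
        points.append((area(L, r), value))
    return points

# All numbers below are hypothetical placeholders, not values from the text.
curve = composite_curve(R=20.0, r=1.0, a=0.8, sigma=0.25, c_casual=10.0, c_specific=0.5)
for S, value in curve[::7]:
    print(f"S = {S:.3f}   sum = {value:.4f}")
```

Setting `c_specific` to 0 recovers the casual curve alone, and setting `c_casual` to 0 recovers the specific curve alone.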
Figure 4. Plots of equation systems (6) and (7), modeling the distribution of contact area for casual and specific (responsible for protein recognition and binding) contacts. Horizontal axis: model area of interresidue contact, Å²; vertical axis: model distribution. (A) The plot of equation system (6) in parametric form. This plot reflects the distribution of areas of stochastic contacts. (B) The plot of equation system (7) in parametric form. This plot reflects the distribution of areas of specific contacts. (C) The plot of the sum of equation systems (6) and (7).
The area under the sum curve is equal to 1. Thus, the contact area distribution can be represented
as a composite one, one part of which is formed by casual contacts and the other by specific
contacts (responsible for protein recognition and binding). Casual contact distribution reflects the
fact that there is a large number of very small contacts (with nearly zero area), and the number of
contacts rapidly decreases with the increase of the contact area. The distribution of specific
contacts, on the other hand, has some average nonzero contact area, and is dome-shaped.