Miscellanea
Biometrika (1979), 66, 1, pp. 158-62
Printed in Great Britain
The closest pair of N random points on the surface of a sphere
BY P. A. P. MORAN
Department of Statistics, Australian National University, Canberra
SUMMARY
Some key words: Geometric probability; Nearest neighbour; Points on a sphere.
1. INTRODUCTION
Consider $N$ random points on the surface of a sphere of unit surface area, and represent
them by vectors $x_1, \ldots, x_N$, of length $(4\pi)^{-1/2}$, which we suppose to be independently and
isotropically distributed. Define $d(x_i, x_j)$ to be the angle between $x_i$ and $x_j$. In a letter,
Dr Alan Perelson, University of California Los Alamos Scientific Laboratory, has raised
the question of determining $P_N(\theta)$, the probability distribution of the smallest of the $\tfrac12 N(N-1)$
angles $d(x_i, x_j)$. This problem is of some interest in studying the way in which antibodies
attack red blood cells.
Clearly no exact answer can be expected; for if we had an exact answer for each $N$ and $\theta$
we would be able to determine the value of $\theta$ at which $P_N(\theta)$ becomes equal to unity. The
knowledge of this as a function of $N$ is equivalent to the classical unsolved packing problem
of determining the largest number of spherical caps of semiangle $\tfrac12\theta$ which can be placed on
the surface of a sphere without overlapping. For references and bounds for this function,
see Rankin (1955). The purpose of the present paper is to obtain fairly close bounds for
$P_N(\theta)$, and to study its limiting behaviour for $N$ large and $\theta$ small. Note that the analogous
problem for points independently and uniformly distributed on a circle is well known and
easy to solve exactly.
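The quantity $P_N(\theta)$ defined above can be approximated by simulation. The following is a minimal sketch, not part of the original paper: the function names, the seed and the number of trials are illustrative assumptions, and isotropic directions are generated by normalising Gaussian triples (the radius $(4\pi)^{-1/2}$ does not affect angles, so unit vectors are used).

```python
import math
import random

def random_unit_vector(rng):
    """Isotropic direction: normalise three independent standard normals."""
    while True:
        v = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
        norm = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
        if norm > 0.0:
            return (v[0] / norm, v[1] / norm, v[2] / norm)

def smallest_angle(points):
    """Smallest of the N(N-1)/2 pairwise angles, in radians."""
    best_cos = -1.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            c = sum(a * b for a, b in zip(points[i], points[j]))
            best_cos = max(best_cos, c)
    # Clamp against rounding before taking the arc cosine.
    return math.acos(max(-1.0, min(1.0, best_cos)))

def estimate_P(N, theta_deg, trials=5000, seed=1):
    """Monte Carlo estimate of P_N(theta): fraction of trials whose
    smallest pairwise angle does not exceed theta."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    theta = math.radians(theta_deg)
    hits = 0
    for _ in range(trials):
        pts = [random_unit_vector(rng) for _ in range(N)]
        if smallest_angle(pts) <= theta:
            hits += 1
    return hits / trials
```

With a few thousand trials the estimate is only accurate to about two decimal places, so it serves as a rough check on the bounds rather than a substitute for them.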
2. UPPER AND LOWER BOUNDS FOR $P_N(\theta)$
Write $C_i$ for the spherical cap of centre $x_i$ and semiangle $\theta$, and $\alpha$ for its area $\tfrac12(1 - \cos\theta)$.
THEOREM 1. If $(N-1)\alpha \le 1$, then
$$P_N(\theta) \le M_N(\theta) = 1 - (1-\alpha)(1-2\alpha)\cdots\{1-(N-1)\alpha\}. \tag{1}$$
Proof. Suppose that the points are put on the sphere sequentially and consider the
probability that their mutual distances are all greater than $\theta$. If $n - 1$ points have been put
on the sphere, the probability that $x_n$ will be distant in angle at least $\theta$ from all of them will
be $1 - |A_{n-1}|$, where $A_{n-1}$ is the set union of all the caps $C_1, \ldots, C_{n-1}$, and $|A_{n-1}|$ is its surface
measure. A lower bound for $1 - |A_{n-1}|$ is $\{1 - (n-1)\alpha\}$, so that arguing by induction,
$$1 - P_N(\theta) \ge (1-\alpha)(1-2\alpha)\cdots\{1-(N-1)\alpha\},$$
which proves (1).
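The upper bound (1) is a finite product and is simple to evaluate. The sketch below is illustrative only; the function names are assumptions, and the angle is taken in degrees to match the numerical examples given later in the paper.

```python
import math

def cap_area(theta_deg):
    """Area alpha = (1 - cos(theta)) / 2 of a cap of semiangle theta
    on a sphere of unit surface area."""
    return 0.5 * (1.0 - math.cos(math.radians(theta_deg)))

def upper_bound_M(N, theta_deg):
    """M_N(theta) = 1 - (1 - alpha)(1 - 2 alpha)...{1 - (N - 1) alpha}, as in (1)."""
    alpha = cap_area(theta_deg)
    if (N - 1) * alpha > 1.0:
        raise ValueError("Theorem 1 requires (N - 1) * alpha <= 1")
    prod = 1.0
    for k in range(1, N):
        prod *= 1.0 - k * alpha
    return 1.0 - prod
```

For $N = 10$ and $\theta = 10°$ this agrees with the value $0.295564$ quoted in Table 1 to the printed precision.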
THEOREM 2. If
$$(N-1)\alpha \le \tfrac23 - \sqrt{3}/(2\pi) = 0.391002, \tag{2}$$
Downloaded from http://biomet.oxfordjournals.org/ at Penn State University (Paterno Lib) on May 17, 2016
Consider N points distributed independently and uniformly on the surface of a sphere.
Exact upper and lower bounds are found for the distribution of the smallest angular distance
between pairs of the points. Similar methods can be used for analogous problems. The present
problem arose in the study of the way in which antibodies attack red blood cells.
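write
$$\beta = \{1 - \tfrac23 + \sqrt{3}/(2\pi)\}/\{1 - (N-1)\alpha\} \le 1. \tag{3}$$
Then
$$P_N(\theta) \ge m_N(\theta) = \sum_{n=2}^{N} (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}\,\beta^{-1}\{1-(1-\beta\alpha)^{n-1}\}. \tag{4}$$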
Here, and in all subsequent similar formulae, we make the convention that the continued
product which occurs as one factor in each of these terms, is equal to unity when n = 2.
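The lower bound (4), together with the convention on the empty product, can be evaluated as follows. This is a sketch under the stated definitions of $\alpha$ and $\beta$; the names `GAMMA0` and `lower_bound_m` are illustrative and do not come from the paper.

```python
import math

# Constant 2/3 - sqrt(3)/(2*pi) = 0.391002 appearing in condition (2).
GAMMA0 = 2.0 / 3.0 - math.sqrt(3.0) / (2.0 * math.pi)

def lower_bound_m(N, theta_deg):
    """m_N(theta) from (4); theta in degrees."""
    alpha = 0.5 * (1.0 - math.cos(math.radians(theta_deg)))
    if (N - 1) * alpha > GAMMA0:
        raise ValueError("condition (2) fails: (N - 1) * alpha > 0.391002")
    beta = (1.0 - GAMMA0) / (1.0 - (N - 1) * alpha)  # definition (3)
    total = 0.0
    prod = 1.0  # continued product, equal to unity when n = 2
    for n in range(2, N + 1):
        total += prod * (1.0 - (1.0 - beta * alpha) ** (n - 1)) / beta
        # Extend the product to k = n - 1, ready for the term n + 1.
        prod *= 1.0 - (n - 1) * alpha
    return total
```

For $N = 10$, $\theta = 10°$ this gives $0.291827$ to six decimal places, and for $\theta = 30°$ the function refuses to return a value because condition (2) fails.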
To prove (4) we need a preliminary lemma.
$$E(|A_n|) \ge \beta^{-1}\{1 - (1-\beta\alpha)^n\}. \tag{5}$$
Proof. We prove this by induction. Write $E$ for the expectation under the restriction
stated above, suppose that the result is true for $n$, and write $E_1$ for the expectation with
respect to the distribution of $x_{n+1}$ conditional on $d(x_i, x_{n+1}) \ge \theta$, for $i = 1, \ldots, n$ and fixed
$x_1, \ldots, x_n$. Let $f(x_{n+1})$ be the measure of that part of the cap $C_{n+1}$, if any, which is not in $A_n$.
If $\alpha\gamma$ is the measure of the region common to two caps whose centres are distance $\theta$ apart,
$f(x_{n+1})$ will then be not greater than $\alpha(1-\gamma)$ if $x_{n+1}$ lies in $A_n$. The average value of $f(x_{n+1})$ when
$x_{n+1}$ is uniformly distributed over the sphere without any restriction is clearly $\alpha(1 - |A_n|)$.
Writing $B_n$ for the part of the surface of the sphere outside $A_n$, we now have
$$\alpha(1 - |A_n|) = |A_n|\Big\{|A_n|^{-1}\int_{A_n} f(x_{n+1})\,dx_{n+1}\Big\} + |B_n|\Big\{|B_n|^{-1}\int_{B_n} f(x_{n+1})\,dx_{n+1}\Big\}. \tag{6}$$
The second term on the right-hand side is $|B_n|\,E_1(|A_{n+1}| - |A_n|)$, where the expectation is
taken conditional on the fixed values of $x_1, \ldots, x_n$.
The first term is not greater than
$\alpha|A_n|(1-\gamma)$. We therefore have, since $|B_n| = 1 - |A_n|$,
$$E_1(|A_{n+1}|) \ge |A_n| + \alpha\{1 - (1-\gamma)|A_n|/(1 - |A_n|)\}. \tag{7}$$
Now consider $\gamma$. We show that $\gamma \ge \tfrac23 - \sqrt{3}/(2\pi)$. To do this consider two spherical caps of
semiangle $\theta \le \tfrac12\pi$ whose centres are $\theta$ apart, measured in angular distance. Then use of a
little spherical trigonometry gives the surface area of the region common to both.
It can be shown numerically that the ratio of this to the area, $\tfrac12(1 - \cos\theta)$, of a spherical cap,
increases as $\theta$ increases from zero to $\tfrac12\pi$. At $\theta = \tfrac12\pi$ it has the value $\tfrac12$ whilst as $\theta$ tends to
zero it tends to the ratio of the common part of two unit circles whose centres are at unit
distance apart to the area of one of them. This is easily found to be equal to
$$\tfrac23 - \sqrt{3}/(2\pi) = 0.391002.$$
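The numerical claim about the overlap ratio can be checked by direct quadrature. The sketch below does not use the paper's closed-form expression; instead it assumes one cap is centred at the pole and integrates, over the polar angle $\phi$, the range of azimuth $\psi$ for which a point also lies in the second cap, namely $\cos\psi \ge \cot\theta\,\tan(\tfrac12\phi)$. The function name and step count are illustrative.

```python
import math

def overlap_ratio(theta, steps=20000):
    """Ratio of the area common to two caps of semiangle theta (radians),
    whose centres are theta apart, to the area of one cap.
    Valid for 0 < theta <= pi/2."""
    h = theta / steps
    cot = math.cos(theta) / math.sin(theta)
    common = 0.0
    for k in range(steps):
        phi = (k + 0.5) * h  # midpoint rule in the polar angle
        # Half-width in azimuth of the part of this latitude circle in cap 2.
        half_width = math.acos(cot * math.tan(0.5 * phi))
        common += 2.0 * half_width * math.sin(phi) * h
    cap = 2.0 * math.pi * (1.0 - math.cos(theta))  # cap area on a unit-radius sphere
    return common / cap
```

Evaluating the ratio on a grid of angles between $0$ and $\tfrac12\pi$ gives values rising from about $0.391$ to $\tfrac12$, in line with the statement above.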
From (6) and the inequality $|A_n| \le n\alpha$, then, if $(N-1)\alpha \le \gamma$, $\beta$ as defined by (3) does not
exceed unity, and $(1-\gamma)/(1-|A_n|) \le \beta \le 1$. Then
$$E_1(|A_{n+1}|) \ge |A_n| + \alpha(1 - \beta|A_n|), \tag{9}$$
LEMMA. Suppose that $x_1, \ldots, x_n$ represent $n$ points which are placed uniformly randomly
and independently on the surface of a sphere subject only to the restriction that all their mutual
angular distances are not less than $\theta$. Let $A_n$, as before, be the set union of the caps $C_1, \ldots, C_n$.
Then, provided that $n\alpha \le \tfrac23 - \sqrt{3}/(2\pi)$, the conditional expectation $E(|A_n|)$ satisfies (5).
and averaging over $x_1, \ldots, x_n$ subject to the restriction that $d(x_i, x_j) \ge \theta$, for $i \ne j$,
we have that $E(|A_{n+1}|) \ge \alpha + (1 - \alpha\beta)E(|A_n|)$.
Using this as a recurrence relation and
remembering that $|A_1| = \alpha$, we obtain (5).
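The step from the recurrence to (5) is a geometric sum: taking equality in $e_{n+1} = \alpha + (1 - \alpha\beta)e_n$ with $e_1 = \alpha$ gives $e_n = \beta^{-1}\{1 - (1-\beta\alpha)^n\}$. A short numerical sketch (function names are illustrative) confirms that iterating the recurrence reproduces the closed form.

```python
def closed_form(alpha, beta, n):
    """Right-hand side of (5): {1 - (1 - beta*alpha)^n} / beta."""
    return (1.0 - (1.0 - beta * alpha) ** n) / beta

def iterate_recurrence(alpha, beta, n):
    """Iterate e_{k+1} = alpha + (1 - alpha*beta) * e_k from e_1 = alpha."""
    e = alpha
    for _ in range(n - 1):
        e = alpha + (1.0 - alpha * beta) * e
    return e
```

Because the recurrence is an inequality with a positive multiplier $1 - \alpha\beta$, the equality solution is a lower bound for $E(|A_n|)$ at every step.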
Proof of Theorem 2. Suppose that the points $P_1, \ldots, P_N$ are put on the sphere sequentially
and define the event $\mathcal{E}_n$ $(n = 2, \ldots, N)$ to occur if there is an $i$ satisfying $1 \le i \le n-1$ such that
$d(x_i, x_n) < \theta$, but that $d(x_i, x_j) \ge \theta$ for $1 \le i < j \le n-1$. Then $P_N(\theta)$ is the sum of the probabilities
of the exclusive events $\mathcal{E}_n$. Consider a lower bound for the probability of the event $\mathcal{E}_n$. The
probability that the points $x_1, \ldots, x_{n-1}$ are all at least $\theta$ apart in angular distance is, by
Theorem 1, not less than $(1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}$. Then the probability that $x_n$,
uniformly distributed, will hit $A_{n-1}$ is, by (5), not less than $\beta^{-1}\{1 - (1-\beta\alpha)^{n-1}\}$. Combining
these inequalities and adding we get (4).
$$\sum_{n=2}^{N} (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}\{1-(1-\alpha)^{n-1}\} \le m_N(\theta).$$
On the other hand, $M_N(\theta)$ can be written as $\sum (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}(n-1)\alpha$, where
the sum is over $n = 2, \ldots, N$, and thus
$$M_N(\theta) - m_N(\theta) \le \sum_{n=2}^{N} (1-\alpha)\cdots\{1-(n-2)\alpha\}\{(n-1)\alpha - 1 + (1-\alpha)^{n-1}\}.$$
We also have $(n-1)\alpha - 1 + (1-\alpha)^{n-1} \le \tfrac12(n-1)(n-2)\alpha^2$. It follows that as $N\alpha \to 0$,
$M_N(\theta) - m_N(\theta) \to 0$.
In (1), suppose that $N\theta \to x$, a finite nonzero constant. Then $N\alpha = \tfrac12 N(1 - \cos\theta) \to 0$ and
$$M_N(\theta) \to 1 - \exp(-\tfrac18 x^2), \tag{10}$$
and this is the asymptotic form of $P_N(\theta)$. This agrees with the asymptotic results obtained
by Efron (1967) and Miles (1970) for random points in finite parts of Euclidean space. Note
that the right-hand side of (10) is a Weibull distribution.
Using (10) and writing $d_{\min}$ for the smallest of the distances $d(x_i, x_j)$, we can easily prove
that $N d_{\min}$ has asymptotically a mean equal to $\sqrt{2\pi}$.
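Both limiting statements can be checked numerically. The sketch below assumes the limit law $1 - \exp(-\tfrac18 x^2)$, obtained from $\alpha \approx \tfrac14\theta^2$ and $\tfrac12 N^2\alpha \to \tfrac18 x^2$; the mean is computed as the integral of the survival function $\exp(-\tfrac18 x^2)$. Function names, the truncation point and the step count are illustrative choices.

```python
import math

def M_upper(N, theta):
    """Upper bound (1) with theta in radians."""
    alpha = 0.5 * (1.0 - math.cos(theta))
    prod = 1.0
    for k in range(1, N):
        prod *= 1.0 - k * alpha
    return 1.0 - prod

def weibull_limit(x):
    """Assumed limiting distribution function 1 - exp(-x^2 / 8)."""
    return 1.0 - math.exp(-x * x / 8.0)

def limit_mean(upper=60.0, steps=200000):
    """Mean of the limit law, as the integral of exp(-x^2 / 8) over (0, upper),
    by the midpoint rule; the tail beyond `upper` is negligible."""
    h = upper / steps
    return h * sum(math.exp(-(((k + 0.5) * h) ** 2) / 8.0) for k in range(steps))
```

For example, with $x = 2$ and $N = 2000$ the bound `M_upper(2000, 2 / 2000)` differs from `weibull_limit(2)` only in the fourth decimal place.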
4. NUMERICAL EXAMPLES
It is not difficult to calculate the upper and lower bounds (1) and (4) on a desk computer.
As examples consider the cases $N = 10$, $\theta = 10^\circ, 20^\circ, 30^\circ$, and $N = 1000$, $\theta = 0.1^\circ, 0.2^\circ, 0.3^\circ$
and $0.4^\circ$. Each pair of bounds for $N = 1000$ took about 3½ minutes on a Hewlett-Packard
9820A. The results are given in Table 1. The condition (2) breaks down for $N = 10$ between
$\theta = 20^\circ$ and $\theta = 30^\circ$ and the result (4) no longer holds. For larger $N$ the bounds become very
close in the region of interest, as is shown by the figures for $N = 1000$.
Table 1. Calculation of $M_N(\theta)$, $m_N(\theta)$ for $N = 10, 1000$

N = 10
θ        (N-1)α      m_N(θ)      M_N(θ)
10°      0.068365    0.291827    0.295564
20°      0.271383    0.736421    0.778851
30°      0.602886    .           0.981599

N = 1000
θ        (N-1)α      m_N(θ)      M_N(θ)
0.1°     0.000761    0.316431    0.316472
0.2°     0.003043    0.781563    0.781967
0.3°     0.006847    0.966733    0.967657
0.4°     0.012172    0.996451    0.997782
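The $(N-1)\alpha$ column of Table 1, and the point at which condition (2) breaks down, can be reproduced in a few lines. The helper name `table_row` is an illustrative assumption.

```python
import math

# Right-hand side of condition (2): 2/3 - sqrt(3)/(2*pi) = 0.391002.
GAMMA0 = 2.0 / 3.0 - math.sqrt(3.0) / (2.0 * math.pi)

def table_row(N, theta_deg):
    """Return ((N - 1) * alpha, whether condition (2) holds) for theta in degrees."""
    alpha = 0.5 * (1.0 - math.cos(math.radians(theta_deg)))
    load = (N - 1) * alpha
    return load, load <= GAMMA0
```

For $N = 10$ the condition holds at $10°$ and $20°$ and fails at $30°$, which is why no lower bound is tabulated in that row.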
3. LIMITING BEHAVIOUR OF $P_N(\theta)$
Next we observe that $\beta^{-1}\{1 - (1-\beta\alpha)^{n-1}\}$ decreases as $\beta$ increases from zero to unity. We
can therefore replace the lower bound $m_N(\theta)$ in (4) by the smaller sum obtained on setting $\beta = 1$.
To modify (4) to take account of the breakdown in (2) suppose that we still have $(N-2)\alpha < 1$
and argue as follows. Let $n_0$ be the largest value of $n$ such that $n\alpha \le \tfrac23 - \sqrt{3}/(2\pi)$. Then,
following through the proof of the lemma, we still have $E(|A_n|) \ge \beta^{-1}\{1 - (1-\beta\alpha)^{n_0}\}$, and
following the proof of Theorem 2 we have, for $(N-2)\alpha < 1$, that
(11)
6. MISCELLANEOUS REMARKS
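$$P_N(\theta) \le \tfrac14 N(N-1)(1 - \cos\theta), \tag{12}$$
$$P_N(\theta) \ge \tfrac14 N(N-1)(1 - \cos\theta) - \tfrac{1}{32}(N+1)N(N-1)(N-2)(1 - \cos\theta)^2. \tag{13}$$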
These inequalities can be strengthened to include more terms of the standard inclusion-exclusion inequalities, but only at the expense of evaluating awkward integrals.
The inequalities (12) and (13) and higher inequalities of the same type provide quite good
bounds when $P_N(\theta)$ is less than about ½, but above this they are unsatisfactory. The main
trouble is that as $N \to \infty$, $\theta \to 0$, $N\theta \to x$, the upper and lower bounds obtained by this method
do not converge to each other but converge to the successive finite term approximations
to $1 - \exp(-\tfrac18 x^2)$ which bracket the true value.
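The behaviour just described can be illustrated numerically. The sketch below assumes the first two inclusion-exclusion terms with $E(A_{ij}) = \alpha$ for each of the $\tfrac12 N(N-1)$ indicators and $E(A_{ij}A_{kl}) = \alpha^2$ for each unordered pair of distinct indicators; the function name is illustrative.

```python
import math

def bonferroni_bounds(N, theta):
    """First-order upper and second-order lower inclusion-exclusion bounds
    on P_N(theta); theta in radians."""
    alpha = 0.5 * (1.0 - math.cos(theta))
    pairs = N * (N - 1) // 2               # number of indicators A_ij
    pair_pairs = pairs * (pairs - 1) // 2  # unordered pairs of distinct indicators
    upper = pairs * alpha
    lower = upper - pair_pairs * alpha ** 2
    return lower, upper
```

With $N\theta = x$ held fixed and $N$ large, the two values approach $\tfrac18 x^2 - \tfrac{1}{128}x^4$ and $\tfrac18 x^2$, the two-term and one-term truncations of $1 - \exp(-\tfrac18 x^2)$, so the gap between them does not close.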
Methods of the type used in the present paper can also be used to estimate the distribution
of the distance between the closest pair of $N$ random points in higher dimensions and in
other subsets of Euclidean space. As an example consider $N$ random points in a square $S$
of unit area. Then if the whole pattern is taken as doubly periodic in the plane, or what is
equivalent, wrapped around a torus, Theorems 1 and 2 remain true if $\theta$ is taken as the
Euclidean distance between pairs of points and $\alpha$ is redefined to equal $\pi\theta^2$.
If the pattern is not defined to be periodic we have a problem involving boundary effects.
Let $A_n$ be the sum set of the circles $C_1, \ldots, C_n$ and let $A'_n$ be the common part of $A_n$ and the
square $S$. Then $|A'_n| \le |A_n| \le n\alpha$, and the inequality (1) remains true. An inequality for
$1 - |A_{n-1}|$ can also be established but would be of rather an awkward form and it is scarcely
worth setting down the details here.
The methods of the present paper could also be used to consider the clustering of points
on rectangular or other types of lattices. As an example consider a rectangular lattice of
$l \times m$ points and suppose that $N$ of these are randomly chosen. Call these 'black' and the
others 'white'. We may ask for the probability, $p_N$ say, that at least two black points are
adjacent, either horizontally or vertically. Then we can argue exactly analogously to the
proofs of Theorems 1 and 2. For simplicity we suppose the whole lattice is repeated
periodically, or what is the same, wrapped around a torus. With each black point associate
the four nearest neighbours and call the resulting set of five points a 'figure'. This corresponds
to a cap in Theorems 1 and 2. Write $\alpha = 5/(lm)$ and denote the $N$ black points by
Another approach to finding bounds on $P_N(\theta)$ is to use an inclusion-exclusion argument.
Given $N$ random points represented by $x_1, \ldots, x_N$, independently and uniformly distributed
on the surface of the sphere, define $\tfrac12 N(N-1)$ random variables $A_{ij}$
$(i, j = 1, \ldots, N;\ i \ne j)$
such that $A_{ij} = 1$ if $d(x_i, x_j) < \theta$, and equals zero otherwise. Then a standard result is that
$$\sum E(A_{ij}) - \sum E(A_{ij}A_{kl}) \le P_N(\theta) \le \sum E(A_{ij}).$$
It is easy to see that $E(A_{ij}) = \alpha = \tfrac12(1 - \cos\theta)$ and
$E(A_{ij}A_{kl}) = \alpha^2 = \tfrac14(1 - \cos\theta)^2$ if $A_{ij}$ and $A_{kl}$ are different; this includes cases where one suffix
is repeated. It follows that (12) and (13) hold.
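$P_1, \ldots, P_N$. Then if $5(N-1) \le lm$, we can prove that
$$p_N \le 1 - (1-\alpha)(1-2\alpha)\cdots\{1-(N-1)\alpha\}, \tag{14}$$
and if $5(N-1) \le \tfrac25 lm$,
$$p_N \ge \sum_{n=2}^{N} (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}\,\beta^{-1}\{1-(1-\beta\alpha)^{n-1}\}. \tag{15}$$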
The first term on the right-hand side is not greater than $3|A_n|$ whilst the second term is
$lm\,(1 - |A_n|)\,E_1(|A_{n+1}| - |A_n|)$.
If $5(n-1) \le \tfrac25 lm$ we have $|A_n| \le \tfrac25$, so that
$$\frac{3/5}{1 - |A_n|} \le \frac{3/5}{1 - 5(N-1)/(lm)} = \beta.$$
Then $E_1(|A_{n+1}|) \ge |A_n| + \alpha(1 - \beta|A_n|)$, where $\alpha = 5/(lm)$. Arguing as before we get, since
$|A_1| = \alpha$, $E(|A_n|) \ge \beta^{-1}\{1 - (1-\beta\alpha)^n\}$, which is certainly no less than $1 - (1-\alpha)^n$. The rest
of the argument then follows as before.
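For a small lattice the upper bound (14) can be compared with the exact probability by enumerating every choice of black points. The sketch below is illustrative only: it assumes the $N$ black points are $N$ distinct sites chosen uniformly at random on an $l \times m$ torus with $l, m \ge 3$, and it checks only the upper bound; the function names are not from the paper.

```python
from itertools import combinations

def adjacent(p, q, l, m):
    """True when sites p and q are horizontal or vertical neighbours on the l x m torus."""
    di = min((p[0] - q[0]) % l, (q[0] - p[0]) % l)
    dj = min((p[1] - q[1]) % m, (q[1] - p[1]) % m)
    return di + dj == 1

def exact_p(l, m, N):
    """Exact probability that at least two of N distinct random sites are adjacent."""
    sites = [(i, j) for i in range(l) for j in range(m)]
    total = 0
    hits = 0
    for blacks in combinations(sites, N):
        total += 1
        if any(adjacent(p, q, l, m) for p, q in combinations(blacks, 2)):
            hits += 1
    return hits / total

def upper_bound_14(l, m, N):
    """Right-hand side of (14) with alpha = 5 / (l * m)."""
    alpha = 5.0 / (l * m)
    prod = 1.0
    for k in range(1, N):
        prod *= 1.0 - k * alpha
    return 1.0 - prod
```

On a $6 \times 6$ lattice with $N = 3$, for instance, `exact_p(6, 6, 3)` stays below `upper_bound_14(6, 6, 3)`, as (14) requires when $5(N-1) \le lm$.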
REFERENCES
EFRON, B. (1967). The problem of the two nearest neighbours (Abstract). Ann. Math. Statist. 38, 298.
MILES, R. E. (1970). On the homogeneous planar Poisson point process. Math. Biosci. 6, 86-127.
RANKIN, R. A. (1955). The closest packing of spherical caps in n dimensions. Proc. Glasgow Math. Assoc.
2, 139-44.
[Received June 1978. Revised September 1978]
where $\beta = \tfrac35\{1 - 5(N-1)/(lm)\}^{-1}$.
Let $A_n$ be the set of all points $P_1, \ldots, P_n$ and their nearest neighbours, i.e. the set sum of
the first $n$ figures. Let $|A_n|$ be $(lm)^{-1}$ times the number of distinct points in $A_n$. Then
$|A_{n-1}| \le 5(n-1)(lm)^{-1} = (n-1)\alpha$, and (14) follows as in the proof of Theorem 1. Write $E$
for the expectation over the joint distribution of $P_1, \ldots, P_n$ conditional on there being no
nearest neighbours, i.e. no $P_i$ lies in a figure belonging to any other point. Write $E_1$ for the
expectation over the distribution of $P_{n+1}$ conditional on its not being in any of the figures
belonging to $P_1, \ldots, P_n$. Let $f(P_{n+1})$ be the number of points of the figure of $P_{n+1}$ which do not
lie in $A_n$. Then if $P_{n+1}$ is in $A_n$, $f(P_{n+1}) \le 3$. The average value of $f(P_{n+1})$ if $P_{n+1}$ had equal
probabilities of being any of the $lm$ points of the lattice would be $5(1 - |A_n|)$. Let $S_1$ and $S_2$
denote summation over the points in $A_n$, and the points not in $A_n$, respectively. Then
$$5(1 - |A_n|) = (lm)^{-1}S_1 f(P_{n+1}) + (lm)^{-1}S_2 f(P_{n+1}).$$