Miscellanea

Biometrika (1979), 66, 1, pp. 158-62
Printed in Great Britain
Downloaded from http://biomet.oxfordjournals.org/ at Penn State University (Paterno Lib) on May 17, 2016

The closest pair of N random points on the surface of a sphere

BY P. A. P. MORAN
Department of Statistics, Australian National University, Canberra

SUMMARY

Consider N points distributed independently and uniformly on the surface of a sphere. Exact upper and lower bounds are found for the distribution of the smallest angular distance between pairs of the points. Similar methods can be used for analogous problems. The present problem arose in the study of the way in which antibodies attack red blood cells.

Some key words: Geometrical probability; Nearest neighbour; Points on a sphere.

1. INTRODUCTION

Consider $N$ random points on the surface of a sphere of unit surface area, and represent them by vectors $x_1, \ldots, x_N$, of length $(4\pi)^{-1/2}$, which we suppose to be independently and isotropically distributed. Define $d(x_i, x_j)$ to be the angle between $x_i$ and $x_j$. In a letter, Dr Alan Perelson, University of California Los Alamos Scientific Laboratory, has raised the question of determining $P_N(\theta)$, the probability distribution of the smallest of the $\tfrac12 N(N-1)$ angles $d(x_i, x_j)$. This problem is of some interest in studying the way in which antibodies attack red blood cells.

Clearly no exact answer can be expected; for if we had an exact answer for each $N$ and $\theta$ we would be able to determine the value of $\theta$ at which $P_N(\theta)$ becomes equal to unity. The knowledge of this as a function of $N$ is equivalent to the classical unsolved packing problem of determining the largest number of spherical caps of semiangle $\tfrac12\theta$ which can be placed on the surface of a sphere without overlapping. For references and bounds for this function, see Rankin (1955).

The purpose of the present paper is to obtain fairly close bounds for $P_N(\theta)$, and to study its limiting behaviour for $N$ large and $\theta$ small. Note that the analogous problem for points independently and uniformly distributed on a circle is well known and easy to solve exactly.

2. UPPER AND LOWER BOUNDS FOR $P_N(\theta)$

Write $C_i$ for the spherical cap of centre $x_i$ and semiangle $\theta$, and $\alpha$ for its area $\tfrac12(1 - \cos\theta)$.

THEOREM 1. If $(N-1)\alpha \le 1$, then
$$P_N(\theta) \le M_N(\theta) = 1 - (1-\alpha)(1-2\alpha)\cdots\{1-(N-1)\alpha\}. \tag{1}$$

Proof. Suppose that the points are put on the sphere sequentially and consider the probability that their mutual distances are all greater than $\theta$. If $n-1$ points have been put on the sphere, the probability that $x_n$ will be distant in angle at least $\theta$ from all of them will be $1 - |A_{n-1}|$, where $A_{n-1}$ is the set union of all the caps $C_1, \ldots, C_{n-1}$, and $|A_{n-1}|$ is its surface measure. A lower bound for $1 - |A_{n-1}|$ is $\{1 - (n-1)\alpha\}$, so that arguing by induction,
$$1 - P_N(\theta) \ge (1-\alpha)(1-2\alpha)\cdots\{1-(N-1)\alpha\},$$
which proves (1).

THEOREM 2. If
$$(N-1)\alpha \le \tfrac23 - \sqrt3/(2\pi) = 0.391002, \tag{2}$$
write
$$\beta = \{1 - \tfrac23 + \sqrt3/(2\pi)\}/\{1-(N-1)\alpha\} < 1. \tag{3}$$
Then
$$P_N(\theta) \ge m_N(\theta) = \sum_{n=2}^{N} (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}\,\beta^{-1}\{1-(1-\beta\alpha)^{n-1}\}. \tag{4}$$

Here, and in all subsequent similar formulae, we make the convention that the continued product which occurs as one factor in each of these terms is equal to unity when $n = 2$. To prove (4) we need a preliminary lemma.

LEMMA. Suppose that $x_1, \ldots, x_n$ represent $n$ points which are placed uniformly randomly and independently on the surface of a sphere subject only to the restriction that all their mutual angular distances are not less than $\theta$. Let $A_n$, as before, be the set union of the caps $C_1, \ldots, C_n$. Then, provided that $n\alpha \le \tfrac23 - \sqrt3/(2\pi)$, the conditional expectation
$$E(|A_n|) \ge \beta^{-1}\{1-(1-\beta\alpha)^{n}\}. \tag{5}$$

Proof. We prove this by induction. Write $E$ for the expectation under the restriction stated above, suppose that the result is true for $n$, and write $E_1$ for the expectation with respect to the distribution of $x_{n+1}$ conditional on $d(x_i, x_{n+1}) \ge \theta$, for $i = 1, \ldots, n$ and fixed $x_1, \ldots, x_n$. Let $f(x_{n+1})$ be the measure of that part of the cap $C_{n+1}$, if any, which is not in $A_n$. If $\alpha\gamma$ is the measure of the region common to two caps whose centres are distance $\theta$ apart, $f(x_{n+1})$ will be not greater than $\alpha(1-\gamma)$ if $x_{n+1}$ lies in $A_n$.

The average value of $f(x_{n+1})$ when $x_{n+1}$ is uniformly distributed over the sphere without any restriction is clearly $\alpha(1 - |A_n|)$. Writing $B_n$ for the part of the surface of the sphere outside $A_n$, we now have
$$\alpha(1-|A_n|) = |A_n|\left\{|A_n|^{-1}\int_{A_n} f(x_{n+1})\,dx_{n+1}\right\} + |B_n|\left\{|B_n|^{-1}\int_{B_n} f(x_{n+1})\,dx_{n+1}\right\}. \tag{6}$$
The second term on the right-hand side is $|B_n|\,E_1(|A_{n+1}| - |A_n|)$, where the expectation is taken conditional on the fixed values of $x_1, \ldots, x_n$. The first term is not greater than $\alpha|A_n|(1-\gamma)$. Since $|B_n| = 1 - |A_n|$, we therefore have
$$E_1(|A_{n+1}|) \ge |A_n| + \alpha\left\{1 - \frac{(1-\gamma)|A_n|}{1-|A_n|}\right\}. \tag{7}$$

Now consider $\gamma$. We show that $\gamma \ge \tfrac23 - \sqrt3/(2\pi)$. To do this consider two spherical caps of semiangle $\theta \le \tfrac12\pi$ whose centres are $\theta$ apart, measured in angular distance. Then use of a little spherical trigonometry shows that the surface area of the region common to both is given by (8).

[Display (8) is not recoverable from the source text.]

It can be shown numerically that the ratio of this to the area, $\tfrac12(1 - \cos\theta)$, of a spherical cap, increases as $\theta$ increases from zero to $\tfrac12\pi$. At $\theta = \tfrac12\pi$ it has the value $\tfrac12$, whilst as $\theta$ tends to zero it tends to the ratio of the common part of two unit circles whose centres are at unit distance apart to the area of one of them. This is easily found to be equal to $\tfrac23 - \sqrt3/(2\pi) = 0.391002$.

Then, if $(N-1)\alpha \le \gamma$, $\beta$ as defined by (3) is less than unity and, since $|A_n| \le n\alpha$, we have $(1-\gamma)/(1-|A_n|) \le \beta < 1$. From (7) it follows that
$$E_1(|A_{n+1}|) \ge |A_n| + \alpha(1 - \beta|A_n|), \tag{9}$$
and averaging over $x_1, \ldots, x_n$ subject to the restriction that $d(x_i, x_j) \ge \theta$ for $i \ne j$, we have that
$$E(|A_{n+1}|) \ge \alpha + (1-\alpha\beta)\,E(|A_n|).$$
Using this as a recurrence relation and remembering that $|A_1| = \alpha$, we obtain (5).
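The bounds (1) and (4) stated above are simple enough to evaluate directly. The following Python sketch is not part of the original paper; the function names `cap_area`, `upper_bound` and `lower_bound` are illustrative assumptions, and angles are taken in degrees for convenience. For N = 10 and θ = 10° it should agree with the corresponding entries of Table 1 (0.291827 and 0.295564) to about six decimal places.

```python
import math

# The constant 2/3 - sqrt(3)/(2*pi) = 0.391002... appearing in condition (2).
GAMMA = 2.0 / 3.0 - math.sqrt(3.0) / (2.0 * math.pi)


def cap_area(theta_deg):
    """Area alpha = (1 - cos(theta)) / 2 of a cap of semiangle theta
    on a sphere of unit surface area."""
    return 0.5 * (1.0 - math.cos(math.radians(theta_deg)))


def upper_bound(n_points, theta_deg):
    """Upper bound M_N(theta) of equation (1)."""
    alpha = cap_area(theta_deg)
    if (n_points - 1) * alpha > 1.0:
        raise ValueError("Theorem 1 requires (N - 1) * alpha <= 1")
    product = 1.0
    for k in range(1, n_points):
        product *= 1.0 - k * alpha
    return 1.0 - product


def lower_bound(n_points, theta_deg):
    """Lower bound m_N(theta) of equation (4), valid under condition (2)."""
    alpha = cap_area(theta_deg)
    if (n_points - 1) * alpha > GAMMA:
        raise ValueError("Theorem 2 requires (N - 1) * alpha <= 0.391002")
    beta = (1.0 - GAMMA) / (1.0 - (n_points - 1) * alpha)  # equation (3)
    total = 0.0
    product = 1.0  # the continued product is unity when n = 2
    for n in range(2, n_points + 1):
        hit = (1.0 - (1.0 - beta * alpha) ** (n - 1)) / beta
        total += product * hit
        product *= 1.0 - (n - 1) * alpha  # extend the product for the next term
    return total
```

Calling `lower_bound(10, 30.0)` raises `ValueError`, since for N = 10 condition (2) fails somewhere between θ = 20° and θ = 30°; `upper_bound(10, 30.0)` remains valid because Theorem 1 only needs (N - 1)α ≤ 1.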
Proof of Theorem 2. Suppose that the points $P_1, \ldots, P_N$ are put on the sphere sequentially and define the event $\mathcal{E}_n$ $(n = 2, \ldots, N)$ to occur if there is an $i$ satisfying $1 \le i \le n-1$ such that $d(x_i, x_n) < \theta$, but that $d(x_i, x_j) \ge \theta$ for $1 \le i < j \le n-1$. Then $P_N(\theta)$ is the sum of the probabilities of the exclusive events $\mathcal{E}_n$. Consider a lower bound for the probability of the event $\mathcal{E}_n$. The probability that the points $x_1, \ldots, x_{n-1}$ are all at least $\theta$ apart in angular distance is, by Theorem 1, not less than $(1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}$. Then the probability that $x_n$, uniformly distributed, will hit $A_{n-1}$ is, by (5), not less than $\beta^{-1}\{1-(1-\beta\alpha)^{n-1}\}$. Combining these inequalities and adding we get (4).

3. LIMITING BEHAVIOUR OF $P_N(\theta)$

Next we observe that $\beta^{-1}\{1-(1-\beta\alpha)^{n-1}\}$ decreases as $\beta$ increases from zero to unity. We can therefore replace the lower bound $m_N(\theta)$ in (4) by
$$\sum_{n=2}^{N} (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}\{1-(1-\alpha)^{n-1}\} \le m_N(\theta).$$
On the other hand, $M_N(\theta)$ can be written as
$$\sum (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}(n-1)\alpha,$$
where the sum is over $n = 2, \ldots, N$, and thus
$$M_N(\theta) - m_N(\theta) \le \sum_{n=2}^{N} (1-\alpha)\cdots\{1-(n-2)\alpha\}\{(n-1)\alpha - 1 + (1-\alpha)^{n-1}\}.$$
We also have
$$(n-1)\alpha - 1 + (1-\alpha)^{n-1} \le \tfrac12(n-1)(n-2)\alpha^2.$$
It follows that as $N\alpha \to 0$, the difference $M_N(\theta) - m_N(\theta)$ tends to zero.

In (1), suppose that $N\theta \to x$, a finite nonzero constant. Then $N\alpha = \tfrac12 N(1 - \cos\theta) \to 0$ and
$$M_N(\theta) \to 1 - \exp(-\tfrac18 x^2), \tag{10}$$
and this is the asymptotic form of $P_N(\theta)$. This agrees with the asymptotic results obtained by Efron (1967) and Miles (1970) for random points in finite parts of Euclidean space. Note that the right-hand side of (10) is a Weibull distribution. Using (10) and writing $d_{\min}$ for the smallest of the distances $d(x_i, x_j)$, we can easily prove that $N d_{\min}$ has asymptotically a mean equal to $\sqrt{2\pi}$.

4. NUMERICAL EXAMPLES

It is not difficult to calculate the upper and lower bounds (1) and (4) on a desk computer. As examples consider the cases $N = 10$, $\theta = 10°, 20°, 30°$, and $N = 1000$, $\theta = 0.1°, 0.2°, 0.3°$ and $0.4°$. Each pair of bounds for $N = 1000$ took about 3½ minutes on a Hewlett-Packard 9820A. The results are given in Table 1. The condition (2) breaks down for $N = 10$ between $\theta = 20°$ and $\theta = 30°$ and the result (4) no longer holds. For larger $N$ the bounds become very close in the region of interest, as is shown by the figures for $N = 1000$.

Table 1. Calculation of $M_N(\theta)$, $m_N(\theta)$ for $N = 10, 1000$

N = 10
  θ       (N-1)α      m_N(θ)      M_N(θ)
  10°     0.068365    0.291827    0.295564
  20°     0.271383    0.736421    0.778851
  30°     0.602886    -           0.981599

N = 1000
  θ       (N-1)α      m_N(θ)      M_N(θ)
  0.1°    0.000761    0.316431    0.316472
  0.2°    0.003043    0.781563    0.781967
  0.3°    0.006847    0.966733    0.967657
  0.4°    0.012172    0.996451    0.997782

To modify (4) to take account of the breakdown in (2) suppose that we still have $(N-2)\alpha < 1$ and argue as follows. Let $n_0$ be the largest value of $n$ such that $n\alpha \le \tfrac23 - \sqrt3/(2\pi)$. Then, following through the proof of the lemma, we still have $E(|A_n|) \ge \beta^{-1}\{1-(1-\beta\alpha)^{n_0}\}$, and following the proof of Theorem 2 we have, for $(N-2)\alpha < 1$, the inequality (11).

[Display (11) is not recoverable from the source text.]

5. MISCELLANEOUS REMARKS

Another approach to finding bounds on $P_N(\theta)$ is to use an inclusion-exclusion argument. Given $N$ random points represented by $x_1, \ldots, x_N$, independently and uniformly distributed on the surface of the sphere, define $\tfrac12 N(N-1)$ random variables $A_{ij}$ $(i, j = 1, \ldots, N;\ i \ne j)$ such that $A_{ij} = 1$ if $d(x_i, x_j) < \theta$, and equals zero otherwise. Then a standard result is that
$$\sum E(A_{ij}) - \sum E(A_{ij}A_{kl}) \le P_N(\theta) \le \sum E(A_{ij}),$$
where the single sums are over the $\tfrac12 N(N-1)$ pairs of points and the remaining sum is over pairs of different variables $A_{ij}$, $A_{kl}$. It is easy to see that $E(A_{ij}) = \alpha = \tfrac12(1 - \cos\theta)$ and $E(A_{ij}A_{kl}) = \alpha^2 = \tfrac14(1 - \cos\theta)^2$ if $A_{ij}$ and $A_{kl}$ are different; this includes cases where one suffix is repeated. It follows that
$$P_N(\theta) \le \tfrac14 N(N-1)(1 - \cos\theta), \tag{12}$$
$$P_N(\theta) \ge \tfrac14 N(N-1)(1 - \cos\theta) - \tfrac{1}{32} N(N-1)(N-2)(N+1)(1 - \cos\theta)^2. \tag{13}$$

These inequalities can be strengthened to include more terms of the standard inclusion-exclusion inequalities, but only at the expense of evaluating awkward integrals. The inequalities (12) and (13) and higher inequalities of the same type provide quite good bounds when $P_N(\theta)$ is less than about [value illegible in the source], but above this they are unsatisfactory. The main trouble is that as $N \to \infty$, $\theta \to 0$, $N\theta \to x$, the upper and lower bounds obtained by this method do not converge to each other but converge to the successive finite term approximations to $1 - \exp(-\tfrac18 x^2)$ which bracket the true value.

Methods of the type used in the present paper can also be used to estimate the distribution of the distance between the closest pair of $N$ random points in higher dimensions and in other subsets of Euclidean space. As an example consider $N$ random points in a square $S$ of unit area. Then if the whole pattern is taken as doubly periodic in the plane, or what is equivalent, wrapped around a torus, Theorems 1 and 2 remain true if $\theta$ is taken as the Euclidean distance between pairs of points and $\alpha$ is redefined to equal $\pi\theta^2$. If the pattern is not defined to be periodic we have a problem involving boundary effects. Let $A_n$ be the sum set of the circles $C_1, \ldots, C_n$ and let $A'_n$ be the common part of $A_n$ and the square $S$. Then $|A'_n| \le |A_n| \le n\alpha$, and the inequality (1) remains true. An inequality for $1 - |A_{n-1}|$ can also be established but would be of rather an awkward form and it is scarcely worth setting down the details here.

The methods of the present paper could also be used to consider the clustering of points on rectangular or other types of lattices. As an example consider a rectangular lattice of $l \times m$ points and suppose that $N$ of these are randomly chosen. Call these 'black' and the others 'white'. We may ask for the probability, $p_N$ say, that at least two black points are adjacent, either horizontally or vertically. Then we can argue exactly analogously to the proofs of Theorems 1 and 2. For simplicity we suppose the whole lattice is repeated periodically, or what is the same, wrapped around a torus. With each black point associate the four nearest neighbours and call the resulting set of five points a 'figure'. This corresponds to a cap in Theorems 1 and 2. Write $\alpha = 5/(lm)$ and denote the $N$ black points by $P_1, \ldots, P_N$. Then if $5(N-1) < lm$, we can prove that
$$p_N \le 1 - (1-\alpha)(1-2\alpha)\cdots\{1-(N-1)\alpha\}, \tag{14}$$
and if $5(N-1) < \tfrac25 lm$,
$$p_N \ge \sum_{n=2}^{N} (1-\alpha)(1-2\alpha)\cdots\{1-(n-2)\alpha\}\,\beta^{-1}\{1-(1-\beta\alpha)^{n-1}\}, \tag{15}$$
where $\beta = \tfrac35\{1 - 5(N-1)/(lm)\}^{-1}$.

Let $A_n$ be the set of all points $P_1, \ldots, P_n$ and their nearest neighbours, i.e. the set sum of the first $n$ figures. Let $|A_n|$ be $(lm)^{-1}$ times the number of distinct points in $A_n$. Then $|A_{n-1}| \le 5(n-1)(lm)^{-1} = (n-1)\alpha$, and (14) follows as in the proof of Theorem 1.

Write $E$ for the expectation over the joint distribution of $P_1, \ldots, P_n$ conditional on there being no nearest neighbours, i.e. no $P_i$ lies in a figure belonging to any other point. Write $E_1$ for the expectation over the distribution of $P_{n+1}$ conditional on its not being in any of the figures belonging to $P_1, \ldots, P_n$. Let $f(P_{n+1})$ be the number of points of the figure of $P_{n+1}$ which do not lie in $A_n$. Then if $P_{n+1}$ is in $A_n$, $f(P_{n+1}) \le 3$. The average value of $f(P_{n+1})$ if $P_{n+1}$ had equal probabilities of being any of the $lm$ points of the lattice would be $5(1 - |A_n|)$. Let $S_1$ and $S_2$ denote summation over the points in $A_n$, and the points not in $A_n$, respectively. Then
$$5(1 - |A_n|) = (lm)^{-1} S_1 f(P_{n+1}) + (lm)^{-1} S_2 f(P_{n+1}).$$
The first term on the right-hand side is not greater than $3|A_n|$ whilst the second term is $lm\,(1 - |A_n|)\,E_1(|A_{n+1}| - |A_n|)$. If $5(n-1) < \tfrac25 lm$ we have $|A_n| < \tfrac25$ and we put
$$\frac{3/5}{1 - |A_n|} \le \frac{3/5}{1 - 5(N-1)/(lm)} = \beta.$$
Then $E_1(|A_{n+1}|) \ge |A_n| + \alpha(1 - \beta|A_n|)$, where $\alpha = 5/(lm)$. Arguing as before we get, since $|A_1| = \alpha$,
$$E(|A_n|) \ge \beta^{-1}\{1-(1-\beta\alpha)^{n}\},$$
which is certainly no less than $1 - (1-\alpha)^n$. The rest of the argument then follows as before.

REFERENCES

EFRON, B. (1967). The problem of the two nearest neighbours (Abstract). Ann. Math. Statist. 38, 298.
MILES, R. E. (1970). On the homogeneous planar Poisson point process. Math. Biosci. 6, 85-127.
RANKIN, R. A. (1955). The closest packing of spherical caps in n dimensions. Proc. Glasgow Math. Assoc. 2, 139-44.

[Received June 1978. Revised September 1978]
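As a supplementary illustration that is not part of the original paper, $P_N(\theta)$ can also be estimated by simulation and compared with the bounds reported in Table 1. The Python sketch below is a minimal Monte Carlo under stated assumptions: the names `random_unit_vector`, `has_close_pair` and `estimate_p` are hypothetical, the sphere is taken to have unit radius rather than unit area (which leaves all angles unchanged), and a fixed seed is used only to make runs repeatable. For N = 10 and θ = 10° the estimate should fall close to the interval between 0.291827 and 0.295564, up to sampling error.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a point uniformly on the unit sphere: the height z is uniform
    on [-1, 1] and the longitude phi is uniform on [0, 2*pi)."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def has_close_pair(points, cos_theta):
    """Return True if some pair of points subtends an angle below theta,
    i.e. if their dot product exceeds cos(theta)."""
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            if xi * xj + yi * yj + zi * zj > cos_theta:
                return True
    return False


def estimate_p(n_points, theta_deg, trials, seed=0):
    """Monte Carlo estimate of P_N(theta): the fraction of trials in which
    the smallest pairwise angle is less than theta."""
    rng = random.Random(seed)
    cos_theta = math.cos(math.radians(theta_deg))
    hits = 0
    for _ in range(trials):
        points = [random_unit_vector(rng) for _ in range(n_points)]
        if has_close_pair(points, cos_theta):
            hits += 1
    return hits / trials


estimate = estimate_p(10, 10.0, trials=20000)
```

With 20000 trials the standard error of the estimate is roughly 0.003, so the simulation cannot resolve the gap between the two bounds; it serves only as a sanity check on their order of magnitude.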