Download Sen, P.K.; (1966)On a class of non-parametric tests for bivariate interchangeability."

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Psychometrics wikipedia , lookup

Transcript
I
'.I
I
I
,I
"I
I
I
,Ie
ON A CLASS OF NONPARAMETRIC TESTS
*
FOR BIVARIATE INTERCHANGEABILITY
PRANAB KUMAR SEN
University of North Carolina, Chapel Hill
and University of Calcutta
Institute of Statistics Mimeo Series No. 462
February 1966
,
il
I
:'1
r
il
I
I
I
I
.e
I
*Work supported by the Army Research office, Durham, Grant DA-3l-l24-G432.
Department of Biostatistics
University of North Carolina
Chapel Hill, North Carolina
I
I
.e
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
.e
I
SUMMARY
A class of nonparametric tests is developed here for testing the hypothesis of
interchangeability of the two variates in a sample from a bivariate population with
an ururnown distribution function.
~1e
classes of alternative hypotheses relate to
possible differences in location and/or Scale parameters
~f
the two variates, and in
this light, the performance cllaracteristics of the proposed tests are studied and
compared with those of the standard parametric tests for the same problems (based on
the assumption of bivariate normal distribution).
1.
lliTRODUCTION
In a random sample of size n drawn from a bivariate population haVing a continuous ctumilative distribution function (cdf) of unspecified form, a class of nonparametric rank order tests is proposed and studied for testing the null hypothesis of
interchangeability of the two variates against admissible families of alternative
hypotheses which relate to shifts in locations or difference in dispersions of the
two variates.
In this context, the concept of rank permutation tests is developed,
and with the aid of this, a class of genuinely distribution free rank order tests is
constructed.
Further, the well known theorem of Chernoff and Savage (1958) [also
see Govindarajulu etal (1965)] on the asymptotic normality of a celebrated class of
nonparametric test-statistics (in the case of two independent samples) is extended
here to the bivariate (or paired sample) case.
This takes care of the study of the
asymptotic properties of the proposed class of tests.
Certain variational cases are also considered here. First, the n sample observations are drawn not from a common population but from distributions which may possibly differ from each other by shifts in location only. As we shall see later on that
this situation often arises in connection with design of
e~eriments
a pair of treatments which are replicated in two or more blocks.
involving only
Secondly, the two
variates may be actually interchangeable only when each of them is normalized by suitable location and scale parameters which are not known.
In such a case, one may be
.i.
nteres'veJ .:.ntesting the identi t,y of locations i'li tnout assUl:ung the scales to be tile
same, Jr in testing the equality of .cales without assuming the equality of locations.
These problems may be
of paired t-test and
regarde~
as the nonparametric generalizations of the problems
P~tman-Mbrgan
test for the equality of variances, respectively.
Finally, in many psychometric problems, tests for interchangeability of two or more
correlated variates arise in
conne~;~?n
with testing the parallelity of two or
tests (in a battery of tests administered on a common batch of SUbjects).
(1946) has proposed and studied the likelihood ratio. test criteria
for the hypotheses
HMvc'
~
and
BYc
Wilks
~C' ~ and
respectively, where H stands for the
Lyc
l~othesis
of equality, M for the means, V for the variances and C for the covariances.
paran~tric
rr~re
Non-
generalizations of these likelihood ratio tests are considered here only
for the bivariate case, while the more general case of
p(~
2) variates will be consid-
ered in a separate communication.
2.
~
Let
I
-'I
I
I
I
I
I
I
STATISTICAL FORMULATION OF THE PROBLEMS
= (XICX'
X ) be a bivariate random variable drawn from a population hav2CX
ing a continuous cd! Fcx(~, ~), and let ~, "', ~ be n independent random variables
distributed according to the cdf's F , ••• , F respectively, vnlere these F , ••• , Fn
l
n
l
need not be identical. Let no":r n be the set of all continuous bivariate cdf' s, and
_I
.j-.;
,
'I.
I:
'!
it is assumed that
Fa
€
n
for all a
= 1,
,
(2.1)
••• , n.
2
Let us denote the two dimensional Eucledian space by R , and let
W be
a subset of n
I
Ii
1
-'1
I:
for vlhich
j
(2.2)
If (2.2) holds, we say that X anc: X2Q: are interchangeable ::>1' ~ is an interchangela
able vector. We are interested in testing the null hypothesis
2
I;
I'
-,
I
I
(2.3 )
I.
I
I
I
against various types of admissible alternatives. Now, in testing the null hypothesis
(2.3), we may. have either of tIle following two situations.
case where
I
Ie
I
I
I
I
I
I
I
,.
I
a
= 1,
••• , n are n independent and identically distributed bivariate
random variables (i. i. d. b. r.
v.) distributed according to a common cdf
F
€
n ,
and we want to test the null hypothesis
I
I
I
~,
First, the conventional
H:
o
F
€ W
(2.4)
•
Secondly, (Xa)'s are independent but may not be identically distributed.
Often, in
such a case, we have
~
= Za
~ +
!ex'
a
= 1,
(2.5)
••• , n
= 1, ••• , n) are n real quantities
wherek is a 2-vector with unit elements, (Za'
a
which mayor may not be stochastic in nature, and
!a
= (Yla, Y2a ) are such that
under the null hypothesis to be tested, they are interchangeable, for each
a
= 1, ••• , n. Situation (2.5) arises in design of experiments involving only a
pair of treatments which are replicated in two or more replicates.
effects under additive model correspond to Za ' a
= 1,
The replicate
••• , n, while the treatment
cum error components are (Yla , Y~), a = 1, ••• , n. Under the assumption of no
treatment effect, ~ will be an interchangeable vector, and hence to test the null
hypothesis of no treatment effect, we may desire to test for the interchangeability
of
!a without
presuming the knouledge of Za' a
= 1,
"'} n.
In testing the null hypothesis (2.3), the class of alternatives we are usually
interested, relates to possible differences in locations or dispersions of the two
variates.
To pose these, we let Fa
= F for all a = 1, ••• , n, where
(2.6)
3
the cdf F (u, v) being assumed to be invariant under interchange of its arguments,
o
2
for all u, v € R , and ~ = (~l' ~2) and ~ = (51' 52) being respectively the location and the scale vector witl1 real elements.
In the classical location problems,
we are interested in testing H ·,in (2.4) under the assumption 51
o
want to test for the
identity~5f
locations
~l
and
= 82 i. e., we .
~
assUD;ling the equality of 8
1
Similarly, in the classical scale problem, we w~nt to test for the equality
and 8 ,
2
of 8 and 8 in (2.6) assuming the equality of ~l and ~2' In the studentized
1
2
location problem, we shall be interested in testing the null hypothesis ~l and
~2
without assuming the identity of 8 and 8 , Similarly, in the general scale problem,
2
1
we are interested in testing the equality of 8 and 8 without assuming the equality
2
1
of
~l
and
~2
or their differ,ence to be known.
Finally, referred to (2.6), we may be also interested in testing the compound
hypothesis
I
.-I
I
I
I
I
I
I
against the set of alternatives that at least one of the two equalities does not
hold.
This will be the nonparametric analogue of Wilk t s (1946)
~C
in the bivari-
ate case.
Suitable families of rarut order tests are offered for each of the above hypotheses and the performance characteristics of these tests are studied and compared with
these of the corresponding standard parametric tests based on the assumption of bivariate normal distribution.
3.
Let
~
RANK ORDER TESTS FOR I. I. D. B. R. V. SET
= (Xla, X2a ),
a
= 1,
••• , n be n i. i. d. b. r. v. 's distributed accord-
ing to a common continuous cdf F(X , ~). We pool these n paired observations into
l
a combined sample of size N(= 2n), and denote these 2n observations by
_I
I
I
I
I
1
1
I;
,1
e:
I'
1
i
~ = (~, l' ••• , ~, N)
4
I:
I
I.
I
I
I
I
I
I
I
Ie
I
I
Where we adopt the convention that
. (3.2)
Let us arrange the N observations in
by
< ••• < ~(N);
I
I
,.
I
(3.3 )
by virtue of the assumed continuity of F(X , X ), the possibility of ties in (3.3)
2
l
may be ignored, in probability. Now, we consider a sequence of rank functions
(Which are real valued functions of N,) denoted as
EN i being an explicit function of i and N, for i
,
N
.,
.
= 2, 4, ••••
= 1,
... , N,
and defined for·each
Let us also define
_ {l if ~J(i) is an Xla ,a = 1, ••• , n;
CN, i 0, otherwise; for i = 1, ••• , N.
(3.5)
Let then
(3.6)
so that
I
I
I
(3.1) in order of magnitude and denote them
N
N
CN •
....
~len,
C • = '\ CN·
N
~=I
l.
i=l
i=l
2
,~
(3.7)
. = n.
,~
we consider a statistic
Tn =
~•~/ ~•~= ~
N
I
(3.8)
EN} i CN,~• •
i=l
Our proposed test is then based on the rank order statistic T , and by considering
n
various possible
~
(with emphasis on different classes of alternative hypotheses)
we will get a class of tests which 'rill be studied here in detail.
5
I
l
e
I
Then (R , ... , Rn) will be a permutation of the numbers (1, ••• , N). We denote by
l
R~ the column vector corresponding to R in (3.9), and define a 2 x n matrix
a
I
I
I
which we term the collection rank matrix.
The matrix
are all distinct and which form a permutation of the
trix thus consists of n random
is itself a stochastic matrix.
n~bers
(1, ••• , N). Tile maThus,
~
Two such collection matrices ~ and ~ (Say,) are
inversions of the columns of tile later.
(!:I.' ••.•.'
has 2n (=t'l) elements which
which constitute its columns.
raru~-pairs
said to equivalent if it is possible to arrive at
significance too.
~
~
from
~*
by a finite number of
This equiValence has some s.tatistica1
This means that if inst.ead of taking the observation vector as
~), we take it as (~l' ••• , ~n) where (i1 , ... , in) is any permutation
of (1, ••• , n), then the two collection rank matrices corresponding to the two
observation vectors will be equivalent, (as it should be).
In this manner, we
achieve the uniqueness of the collection rank matrix for any given (JL
N-:I.'
The total number of possible realizations of
~
... ,
~).
(on non-equivalent sets only) is
evidently equal to (2n):/n: , and the set of all such possible realizations of ~
is denoted by ....N.
Thus, any
matrix with column vectors
observed"~
Ri' ••• ,
lies in "'N.
Now,
~~
is a 2 x n
stoCh~stic
R~.
If we permute the two rank elements within
n
each of the n colums, we will get a set of 2 possible rank matrices which can be
obtained from~.
This set is denoted by S(~)2 so that~€ S(~). On the othern
hand, S(~) is a subset of ~, there being [(2n)1/ 2 ·nJ] such SUbsets. Thus, we
get
6
I
I
I
_I
I
I··1
.
1
II
I!
I
II
Ii
I'
e.
I
I
I.
I
I
I
I
I
I
I
The set S(~) will be termed the permutation-set of ~.
The probability distribution of ~ over NN (defined on an additive class of
subsets ~ of, .....N:) will evidently depend on the parent cdf F even under the null
However, if the null hypothesis H in (2.4) holds, given~, the
o
first variate may assume either of the two 'Values X , X with equal probability
la 2a
(i.e.,
for all a = 1, ••• , n. Hence, given~, the variable Xla can have either
hypothesis (2.4).
t)
of the two ranks R , R witll equal probability. Thus, given~, the conditional
la 2a
n
distribution (under H ) of the rallies of (Xl' a = 1, ••• , n) over 2 possible
o
a
realizations (obtained by intra-pair permutations) would be uniform•. This implies
that, if corresponding to the given rank matrix ~ we consider the permutation set
n
S(~), then given S(~) there will be 2 possible realizations of ~ Which are
conditionally equally likely, viz.,
Ie
I
I
ThUS, if we consider the statistic Tn in (3.8), it follows from
n
the above discussion that given the set S(~) there will be 2 equally likely
I
realizations of T.
I
I
I
I
,.
I
for any S(R
T).
~~
(conditionally) realization of ~ which in accordance with
n
(3.8) will lead to 2n
This set of values of T is denoted by T [S(R_)]. Thus, from
n
n
~
(3.12), we get that conditionally on T [S(R_)], the permutation distribution of T
n NN
n
n
over the 2 elements of this set (not necessarily all distinct) would be uniform
when H in (2.4) holds.
o
Let us denote this permutational probability measure by
r n, and consider a test function
~(~)
Which with the aid of
r n associates to each
probability of rejecting H in (2.4), so that 0 ~ ~ ~ 1. Since r is a como
n
pletely known probability measure, we can always select ¢(~) such that
~a
7
e being the preassigned level of significance of our test.
E(¢(~-)
NN
Stein
I H)
0
=
e.
(1949) that
(3.13) implies that
Hence, it follows from a well-known result of Lehmann and
¢(~)
has the property of S(€)-Structure of tests and hence is a
strictly distribution free similar test of Size
e.
In actual practice, if n is not large, then corresponding to our given ~ we
n
have a rank-matrix ~Whose permutation set S(~) will then consist of 2 elements
n
(~), and corresponding to these elements we have the set of 2 values Of.T which
n
is denoted by Tn[S(~)].
Thus, conditioned on the given Tn[S(~)], we can find out
n
the permutation distribution function of T (over the 2 elements in this set), and
n
the same may be utilized to construct the exact test
~(~).
However, if n is large,
the labour involved in this computation increases considerably, and as in the case
of majority of other permutation tests (for various other problems in inference,)
we shall develop first the asymptotic permutation distribution theory of Tn and
Extending the idea of Chernoff and Savage
(3.4)
(1958)
under the relaxed conditions of Govindavajulu et al (1965), we define
(4.1)
need be defined only at i/(N+l), for i = 1, ..• , N and its
N
domain of definition may be e1~ended to (0, 1) by the Cherpoff-Savage convention.
where the function I
Let us then define
FN[i](X)
= ~
I'
I
I
I
I
I
_I
4. ASYMPTOTIC PERMUTATION DISTRIBUTION OF Tn
as well as on the parent cdf F.
_I
I
use the same for the construction of large sample permutation tests.
For this study we shall impose certain regularity conditions on ~ in
I
I
I
I.
Il
I
IJ
I:
... 1
[Number of Xia
S x],
i
= 1,
2;
(4.2)
(4.3)
(4.4)
8
-,
I~
I
I.
I
I
I
Again~
let F[i](X) be the marginal cdf of Xia, i = 1, 2, anQ let
(4.5)
For the study of the asymptotic permutation distribution of T , we shall assume
.
that
••I
= Nlem= co
J(H)
(c.l)
r
J.
N
N
i
(c.2)
,I
J
I
I
II
I
I
I
I
I
I
n
a=l
[IN
J [IN(N~l
EN
1 and is not a constant.
(N~l) - J~N~l)J = o(N-~),
co
and
°< H <
(H) exists for
(x»
J(N~l EN
-
(x»
(4.6)
JdFN[l](x)
= Op(N-~).
(4.7)
-co
(c •.3)
IJ
(")
1
° < H < 1,
J(H) is absolutely continuous in H:
(H) I
di
.
= I --r J
1
(H)I ~ K[H(l_H)]-l-~
8
and
,
(4.8)
dH
for i = 0, 1 and some 8 > 0, vnlere K is a constant.
In addition to these three
conditions, we require the following two conditions only for the asymptotic coni
vergence and non-nullity of tIle variance of the permutation distribution of nar •
.
n
(c.4)
i
N
I [ ~ (N~l)
-
I- (N~l) J
= 0(1);
(4.9)
ex=l
co
co
11
[IN(!k-
EN
(x»
IN(N~l EN
N
- J(N+r
EN
(x»
(y»
N
J(N+l
EN
(y»
= 1,
2;
1
dFN(X, y)
..J
= 0 P (1)
(4.10)
Also, "I-'le define
co
Vii (F)
=
J
I(H(x»
dF[i] (x) for i
(4.11)
-co
co
v12 (F) =
co
J ~f
-co
J(H(x»
J(H(y»
-co
9
dF(x, y),
(4.12)
I
and let
/Vll
=\
v(F)
'
(4.13 )
(F)
,v
I'J
(F),
12
"
Let no be the set of all continuous bivariate cdf"s,for which v(F) in
(c.5)
I'J
(4.13) is positive definite.
F
€
Then ive assume that
no c n.
(4.14)
For any finite N, let us define
~=
N
co
+ y~cx,CX = 1
IN
(~I\r
(x»
~(x).
(4.15 )
CX=l
Using conditions (c.2) and (c.3) :and defining ~=
JIJ(H) dH, we readily get that
(4.16)
=
.r
~
N
12
;r (H) dH,
(4.17)
L
N,RICX
N, R2cx
Finally, vIe define
(4.18)
Using then (3.8) and (3.12), we get readily that
I r n) '= EN
n
= ~ +
1
0
(N-2) ,
(4.19)
(4.20)
Now, from (4.5), (4.ll) and ~4.l7) we get that
o<
I
I
-I:
1
a=l
E(T
I.
"
'which by (e.l) and (c.3) will be a finite non-null quantity.
n
a 2 =..l:.... \' [E
_ E
]2
N
I
I
at
Let us also define
2
v
.-I
vll(F) + v22 (F)
=1 1
,
= 2
(H(X»
d H
10
= 2i
~j
I,
I
I,
I
d [F[l] (x) + F[2] (x),]
1
,fI(H)
1
I
<
co.
(4.21)
11
e I:
1
I~
I
I.
I
I
I
I
I
I
I
Ie
I
THEOREM 1+.1 If F
••I
o
PROOF From (4.18) we get that
N
2
O"N
1
=N
\'
L
N
2
EN,a -
2
N
a=l
\"
E
EN, R N,R
la
2a
a=l
(4.22)
I.
Also, from (4.1), (4.3), (4.9) and. condition (c.3) it is easily seen that
N
1 \
N
LJ
a==l
co
(N~l ~(x»
E2
N,a
d.Hyg(n)
(H) dH + 0(1) == }
=
J~(if1r ~(X»
+ 0(1).
~(x)
+ 0(1)
(4.23 )
So we require to show that the second term on the right l1and. side of (4.22) converges to
co
-co
v
12
(F).
Now, using (4.1), (4.3) and (4.12), the same can be written as
co
-co
co
=JJ
(4.24)
-:.0 -co .
AlSO, using (4.4) and (4.5), we have
I
I
I
n and the cond.itions (c.l), (c.2), (c.3) and (c.4), (c.5)
hold, then
~o
I
I
I
€
{
Sup 1
2
x n'21 2n~1 FN [i ] (x) - F [i ] (x)
}l
I
-!
(4.2.5 )
Thus, using the wellknown results on the univariate Kolmogorov-Smirnov statistic
we get from (4.25) that
s~p N~ IN~l HN(x) -
H(x)
is bounded in probability.
11
(4.26)
I
Again b:t- sh:rple algebraic mani'pulations we get'tha.t
HN
o ~ FN(i] (x) ~ 2
for
(x), 0
~ [1 - FN[iJ
(x)]
~ 2 [1 -
HN
(x)]
(4.27)
= 1, 2. Proceeding then precisely on the same line as in the proof of Theorem
i
5.2 of Puri and Sen (1966), He arrive at the stochastic convergence of (4.24)
v
12
(r).
toX
Hence, we get that
2 P 2
erN
~ v -
(by (4.21),).
1
= T[ vll {F)
v (F)
12
(4.28)
+ v
(F) - 2v12 (F)]
22
It remains only to show that if F
€
2
no' then v -
v
12
(F) > O.
From
(4.28), we get that
(4.29)
Where I 2 = (1,1) and X (F) is defined in (1;.13). Sinse for F € no' ")(F) is positive
definite and (4.29) is a ~uadratic form in ~ (F), with a non-null vector 12 , it follows
readily that it will be a.lso
~
positive quantity.
Hence, the theorem.
THEOREM 4.2 If F
n and the conditions (c.l) through (c.5) hold, then under the
€
o
I
I
I
I
I
I
_I
fJ
1
permutational probability measure ?n' the statistic N2 (T
- ~N)f~N has asymptotically,
n
in probability, a normal distribution with zero mean and unit variance.
~
--I
From (3.4), (3.5), (3.8) and (4.15) we readily get that
I,
I
1
1
n'.;if = -1...n!'\
F
-
N
-L[E
2
N, RIa
- E
(4.30
H
N, R2CX •
11
11
1
Now, the permutational probability measure r defines a set of 2n (conditionally)
n
e
1 ~ n
equally probable values of Tn - ~ which may be expressed as Ii ) CX=1 ~ a' Where
'-'
~
,a
' a
= 1,
P(d.~I'l,.....
N
,
.•. , n are independent and
1
= -2 [E'I\l
..1,
,.,
.l.\la
- E
N, R2a
]
I?n ) = 1,
2
(4.31)
P (eL
"TI,a
= - 12
[N
N, RIa
- E
N, R2a
H
f~
n
.
= 12'
12
. 1
J
Ii
I!
-I:
I
I
I
.e
I
I
I
I
I
I
II
I
I
I
I
I
I
.e
I
Let us nOvl d.efine another set of n independent random variables {W. ' CX = 1, ... , n}
N,CX
by
(4.32)
Then, it follows from candition (c.2), (4.30) and (4.32) that under the
per~ltational
probability measure P n
(4.33 )
1
n
Hence, it is sufficient to show that n- 2E CX=l WN,cx has (under the permutational probability measure P ) asymptotically a normal distribution. To prov€ this, we will use
n
the Berry-Essen theorem [ef. Loeve (1960, P 288)], which may be stated as follows:
Let (Z.) be a sequence of independent random variables with means
l
2
{cr.) and absolute third moments
{~.),
l
l
{~.)
, variances
l
and let
(4.34 )
Also, let Gn(X) be the cdf of E CX~l(ZCX - ~cx)/Sn and <.p(x) be the standardized normal
cdf.
Then there exists a finite constant
!Gn (x) - <.p(x)! < C pn
c«
ro), such that for all x.
/ S3.
(4.35)
n
Thus, if we t~~e WN,cx for Zcx' cx= 1, ••• , n, then from (4.32), ,re get that ~CX
=0
Also, precisely on the same line as in theorem 4.l, we get
for all CX= 1, ••• , n.
that
n
13
2
n n
:=
1 \' E(~
n
L.
CX=l
N,cx P n )
=
~
n
R
J(~)]2
N+l
[
cx=l
(4.36)
Again, using condition (c.3) and (4.32), we get that
13
I
n
1.
p = 1 \' E( Iw.
I 31 ? )
n n n L.
N,a
n
a=l
(4.37)
From,
(4.36) and (4.37), we readily get that
(4.38)
(4.35), we get that under the permutational probability measure r n,
and hence from
1
n-2' r.tt~l WN,a/crN has asymptotically, in probability a norr(lal distribution with
zero mean and unit variance. The rest of the proof follows from (4.33).
Hence, the theorem.
(It may be noted that the probability measure IP
n
being a conditional measure (depend.
ing on ~), the above theorem 110lds, in probability, i.e., for all most all ~.)
Before we proceed to formulate the permutation test function $(~) for both
l..
small and large samples, we will study the asymptotic distribution of n 2 (T
n
-~)
for
arbitary parent cdf F, as the sarne will be required subsequently to study the
asymptotic properties of our proposed test.
5. ASYMPTOTIC NORMALITY OF Tn FOR ARBITRARY CDF
It follows from
(3.8), (4.1) (4.2) and (4.3) that
1 IN[N~l ~
C:l
Tn =
(x)] dFN[l] (x),
which has a very close analogy with Chernoff-Savage type of Statistics, (with the
only difference that here there are n paired observations,
there are N independent observations).
=1
i~1ile
in the other case
Let us then define
c;()
~n(F)
J(H(x» dF[l](x),
(5.2)
14
e-I
I
I
I
,
I
I
ttl
I
I
I
I
I
I
,
Ii
e.i
I:
I
II
I
I
I
I
I
I
O"~
.-
I
~
{J J
-co
<
X
<
F[2] (x)[l - F[2](y)] J'(H(x»J'(H(y)
Y
<
-co
<
-co
THEOREM 5.1
dF[2](X)dF[2] (y)
co
co
-J
F[l](X)[l - F[l](y)] ,,"'(H(x»J'(H(y»
< Y < co
X
dF[l](X) dF[l] (y)
co
+/ J
J[F(X,y) - F[l](X) F[2](y)] J'(H(x»
J'(H(y»dF[l] (x) dF[2](y)}
-co
If 'the conditions (c.l), (c.2) and (c.3) of section
continuous F for which
4 hold then
f~
any
O"~(F) > 0,
1
1
lim
2 (T - ~ (F» 10" (F) < t} = --P{n
n= co
n
n
n
.J21C
2
t
J
e
_.!.x
dx.
2
_00
PROOF
If we write
(i)
Ie
I
I
I
I
I
I
I
(F) =
FN[l] (x)
co
(ii)
J
IN
= F [1]
[N~l ~
(x) + [FN[l] (x) - F [1] (x)],
co
(x)
J dFN[l]
(x) =
-co
JJ(N~l ~
(x»
dFN[l] (x)
-co
(by (c.l),)
(iii)
J(ri:l
~
(x»
iJ(N~l ~
+
=J(H(x»
(x»
(~
- J(H (x»
(x) - H(x»
-
(N~l ~
JI (H(x»
(x) - H(x»
-
N~l ~
(x) J' (H(x»
J' (H(x» } ,
then proceeding precisely on the same line as in the modified proof of ChernoffSavage theorem by Govindarajulu, LeCam and Raghavachari
(1965), we get after a few
simple steps that
T =
n
~n
4
(F) + B
+ B
+ \
C
1,N
2,N
~. i,N,
i=l
where
15
(5.4 )
~
jrFNt2J --' (x)- F [2] (x).} Jf (H) dF [l} (x),
(5.5)
B2 "N
= .~
jrFN[l] (x) - ,F[l] (x)] Jf (H) dF[2] (x),
(5.6)
Cl,N
= - N~l
C2,N
=,f[~
C3,N
=J[J(N~l ~
Bl,N =
J~
(x) Jl (H(x»
c1FN[l] (x)
(x) - H(x)] J'(H(x»
(x»
d[FN[l](X) - F[l](XH,
- J(H(x) -
(N~l '~
(x) - H(x»
(5.8)
J'(H(x»] dFN[l](x')
(5.9)
and
(5.10)
As in the proof of
Che~noff-Savage
1
2, 3 aIle all
0
theorem we shall show later on that Ci,N ' 1=1,
.-
(N-2) and may be termed the higher order terms.
p
Thus, from (5.4),
we get that
n!(T
n
-
Il (F»
n
= n![B';
.~
1,N
B
2,N
]+
n~{
4
)', 'C. NJ•
'-
1,
(5.11)
_I
11
11
j
(5.12)
where
C:)
J[F l [2] - F[2]] J'(H) dF(l] (x),
-
I
I
I
I
I
I
I'.
With this, we consider first the asymptotic distribution of
=~
-.
1
i=l
Bl(Y)
I
I
11
Ii
I;
I
1
co
_r.o
Fl[l] and Fl [2H being the empirical cdf (marginal) of (Xla, X2a) based on a random
16
1
eII
I
sample of size unity.
Thus" it t'ollows from (5.12) that Bl N + B N is the average
2
,
)
I.
I
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
••I
of n independent and identically distributed random variables
1
{[B (x ) - B (x ) L
la
2
l
2a
Hence, for the asymptotic normality of n 2 [B ,N + B ,N] in (5.12),
2
l
we may use the classical central limit theorem, provided the variance of [B (x2a ) l
a
= 1,
••• , n).
B (x )] is positive and finite.
2
la
Now, using (5.13), we get that
coco
4 Var{I\. (y))
=E{fJ[FII2 ](x)
-Fl [2] (x)] [F1:[2H (y) -F[2] (y)H.
-oo-co
Jf (H(x) )J f (H(x» dF [1] (x)dF [1] (y»)
=2
JJ
F[21 (x) [1-F[2]ly)] Jf(H(x) J'(H(y»
-oo1(x<y<t':)
dF[l] (x) liF[l]
(y~.
(5.15)
Similarly,
4 Var{B2 (x)}
=2
JJF[l]
-~
< x< Y <
(x)[l - F[l] (y)H J'(H(x»
Jf (H(y»
dF[2]
(x~
dF[2] (y);
co
coco
4 Cov(Bl(Y), B2 (x)}
=fJ[F(x,y)
- FC1](X) F[2] (y)H JI (H(x»
-co-co
J'(H(Y»
:oF[l] (x) dF[2] (y).
From (5.15), (5.16) and (5.17), vIe get that the variance of [B (x ) - B2 (JS.aH is
1 2a
equal to (j2(F), given in (5.3). Now, in the statement of the theorem, it is assumed
n
that (j (F) isn~n-zero. Now, from (5.3), (5.15), (5.16) and (5.17) we get that
n
~(F) ~ 2 [Var {Bl (Y»)
+ Var (B
2
(X) )].
(5.18)
Thus it is sufficient to show that Var{B (X )} + Var{B (X »)iS finite.
1 2a
2 la
(4.5), we have
(i) F[i](l - F[i]) 5 4 H(l - H) for i
(ii)
dH (x) ~ ~ dF[i](x), i
= 1,
2•
17
= i,
2;
Now, by
(5.19)
I
Hence, from (5.15), (5.16), (j·.l>} anel (5.20) we. gat after .n fe,vl silrq)le s.teps
_I
.::; is
-
V
r
c~< X
dH
I
(x) dH (y)
< Y < cc
1
1
r
.J 'J(u~
l(u) du -
,;: S[
sr' (H(y»
(H(X) [I':H(y)] JI (H(x»
I.'
\"j
clu)2]
~
1
8
J
?(u) du
o
o
1
2
. .:. 8 K
f
[U(l_u)]1-28 du <.
co,
by condition (c.3).
(5.21)
o
Thus, to complete tIle proof of the theorem, it remains only to show that C N'
i
1:.
.
i = 1, 2, 3 are all 0 (n- 2 ). To d.o this l we first note that'all"the elem.~mtary_
results of Chernoff p and
Savage (1958, p. 186; results 7.A.l to
1
7.A.IO) also hold in our case witil the funther simplification that \- - '\ = 2" •
For brerity, these results are therefore not reproduced again.
Now, as in. the
treatment of C
of Cherenoff and Savage (1958, p 988), we have readily
13N
IC1,N I
L
n
:s N:l'
;
:s
'IJ'(H(Xla)}1
N:l .
a=l
K
~N+r'
1
h
~
I
-i +
n
6
[H(Xla )(l-H(Xla ») .
a:l
n
\
(5.22)
I.
a:l
.
_.1.
Thus, we are to show that C2,~ and C ,N are also 0p(n 2).
3
Let now (~, bN) be the interval SN~ , where
SN
r,E
= ( x:
H(l - H) > q/n}.
Then by the result 7.A.9 of Chernoff and Sava.ge (1958, p 98G) (vlith a simple extension
to the bivariate case,) there exists an ~€ > 0 independent of F(x , x ).
2
1
.
P {X . ~ SN
, i = 1, 2; a= 1, .•• , n J ? 1 - €.
J.a
,€
...
Such that
(5. 23 )
Thus,. wi.tIl probability greater than l-€ , there are no observations in the, complementary region SN
, €
.
K being a constant, which rnay depend on
..
€
and
18
~
€
el
I
I
I
I~
. (5.23)
€
I
I
I
I
I
I
. Furtller, it follows from lemma
I
I
I.
-.
I
I
I.
I
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
••I
4.2.2 of Govindarajulu et al (1965) that for every "I > 0, t:lere exists a e(?,) > 0,
such that with probability greater than 1 - ~?" we have"
(5.24)
for i
= 1, 2. Thus, from (5.24), we get that with probability greater than
IBN (x)
- H(x)
I=~
2
\
j
[FN[i] (x) -F [i] (x)]
2
j IFN[i] (x) -
I :s ~
i=l
1-"1,
F [iff (x)
I
i=l
2
I
:s ~Jr) n-i
{F[i] (x) [1 - F[iU (x)] }(1-5)/2
i=l
1
=
e*(?,) N- 2 {H(x) [1 - H(x)]}
1'2
8
,
° < c*(?,) <
"
co.
Thus, expressing the integral on the right hand side of (5.8) as the sum of two integrals over the domains of integration SN
,€
and SN
,€
respectively, and noting that by
condition (c.3) and (5.23) (with probability> 1 - €,)
"U
sN,
[~(x)
- H(x)] JI (H(x) d [FN[l](x) - dF[l](x)ll
€
1
~
<2
1-K/N
o
"
= o(n
[H(X)]-~ +8 [i_H(X)]-i+8d[l_H(~
_1-
2);
(5.26)
we get that
Using t11en (5.25), we get that vith probability greater than 1-1',
19
I
~
Nt IHN(x) - H(x) IJ'(H(x»
5
c*(y) [H(X){1_H(x»)U- l + / 2
_I
Since, the right hand side of (5.28) is integrable with respect to H(x), c1FN[{](X)
~ dHN(X), HN(X) a~. H(x), FN{l] a~. F[l](X)' and dF[l](x) ~ 2 dH(x), it can be
shown after a few simple steps (with the aid of(5.27) and (5.28),) that
Nt IC 2,N 1
~O
i.e., IC2 ,N
1=
0p(It
t ).
($~.:29)
.1-
The proof of IC3,Nlbeing 0p(N- 2 ) follows precisely on the sroae line as in the proof
of a similar C ,N (for the case of two independent samples) considered by Govindarajulu
3
et al (1965), and hence for the intended brevity of our discussion, the details are
omitted.
Hence, the theorem.
THEOREM 5.2
If the conditions of theorem
5~1
hold and in addition
(5.30)
where '.I! has a continuous density '!rex) arid eN -+ 0, ~ -+ 1 as N -+ co, then oi(F), del
(defined in (!l:.14),) and n"2(Tn - ~n(F»/crn(F)
has asymptotically a normal distribution with zero mean and unit variance.
fined in (5.3), is positive for all F
€no
PROOF
In order to prove the tlleorem, it appears to be sufficient to show that under
2
(5.30), crn (F), defined in (5.3), is positive.
Then proceeding precisely on the same line as in the proof of corollary 4.12
of
Govindarajul~ et
al (1965), we get that under (5.30) the sum of the first two
1
1--2
integrals on the right hand side of (5.3) converges (as n -+00) to 2{ f ~(u) du
1
0
2
-[ f J(u) dU]}.
Again, by a Similar technique it can be Sh01qn that the last
o
integral on the right hand side of (5.3) converges ( as n-+ ro ) to
1
~ {J l
o
co
(u) du -
co
J..f
-co
J(F[l]lx»
-co
J(F[2n(y»
.
20
dF(x, y)}
(5.31)
I
I
I
I
I
I
I
el
I
I
I
I
I
I
I
,
-.
I
I
I
.e
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
.e
I
Again fr~m (4.11), (4.12) and (4.17) we
~(}
=
12 (F»
-
V
t
get that (5.31) is nothtDg but
Ivll(F) +v22 (F) - 2v12 (F») >
9 for
all F
no' by condition
€
(4.14) and (c.5).
Hence, the theorem.
Incidentally, from theorem 4.1 and theorem 5.2, we arrive at the following
result that under (5.30)
5.32)
for all F
€
n.
o
This result will be very useful in the next section.
6.
ASYMPTOTIC EFFICIENCY OF RANK ORDER TESTS
We shall now consider some specific rank order test for location or scale and
study their effeciency aspects.
It follows from our results in the preceeding three
sections that the exact permutation test based on T in (3.8) reduces to a very
n
simple form for large values of n.
Also, it follows from theorem 5.2 that if we
have any consistent estimate ~ (F) of ~2(F) in (5.3) (under H ), then a large sample
n
n
a
1
(unconditional) test may be based on the studentized statistic n2 (T n
where ~ is defined in (4.16).
B~
~)/ ~n (F),
virtue of theorems 4.2 and 5.2 and of (5.32),
we may then readily conclude that the large
sa:rqple permutation test and the large
sample unconditional test will be stochastically equivalent for the entire class of
hypotheses in (5.30).
Hence to study the asymptotic power efficiency of the test
based on T , it will be sufficient to consider only the large sample unconditional
.
n
tests.
6.1
Location problem For the matched sample location
prob~e~
and X -B are statistically interchangeable for some real
2a
states that
e = O.
e,
we assume that X
la
and the null hypothesis
We shall consider several rank order tests based on statistics
of the type (3.8), for which we assume further that EN,a is a monotonic function
of
1 <
< N.
As in the case of univariate location tests, the above condition
case ensure the consistency of the tests for any non-nUll
21
e.
We shall consider
I
PpecifiCally the following ~.
(i)
Median Test
E
::::
Here we have
J1 if a ~ [H/2],
l Oif
N,a
(6.1)
a > [N/2],
. . .':'
........
([s] being the largest integer contain in s).
In the univariate case, the correspond-
ing,test is propesed by MOod (1950) and is known as the median test.
For the case
of two independent bivariate samples, the generalization of the test is due tJ
ChatterJee and Sen (1964).
(ii)
Rank-sum test
EN,a::::
Ct,
for
Here, we let
a:::: 1, ••• , N
(6.2)
--I
I
I
I
I
In the univariate case, the cJrresponding test is due to llilcoxon (1945) and Mann
and vhlitney
tion of
t~le
(1947).
For the case of two independent bivariate samples, generaliza-
same has been mac"!.e by Chatterjee and Sen (19611-).
Y-Score test
(iii)
Let Y be some assumed cdf, and let in a
sample of size N drawn from a population having tpe cdf Y,
value of the a-th order Statistic, for a:::: 1, ... , N.
R_
~,Ct
::::
a_
l'J,Ct
~
,a
rando~
be the expected
TIlen, we take
,for a:::: 1, ••• , N.
(6.3)
Sometimes, vIe also take
(6.4)
In pa:-cticular, if Y is a stanc1ardized normal cdf, the sc,:)res {~r a} are teste6. t Lle
,
normal scores and the test as tIle normal, score test.
In tile univariate case, tests
of this type are due to Fisher and Yates, Terry, Hoeffding among many others, and a
,22
el
I
I
'
I·
"
I'
I
I
I
-.
I
I
I.
I
I
I
I
I
I
I
nice account of the same Is given by Chernoff and Savage (1958).
the case of multivariate multi sample situations is due to Puri and Sen (1966).
For the study of the asymptotic relative efficiencies (A.R.E.) of these tests as
well as the parametrically optimum paired t-test, we have to select some sequence of
alternative hypotheses (say,
open interval (0,1).
Hn(1) :
where
F [2] ( x )
I-
•-
{g(l,) for which the
n
p~er
of the tests lie in the
For this, we define
)
= F [1] (-~
x + n e,
(6.5)
e is real and finite and the cdf F[l] (having a continuous density f[l] (x),)
is assumed to satisfy the conditions of lemma 7,2 -of'Puri{1964),Let us then define
1
2
A (F,J)
J?
=
1
(u) du -
o
p(F, J)
=
{,f J(u) dU}2;
(6.6)
0
co
Ie
I
I
I
I
Generalization to
co
[J J
J(F[l] (X»J(F[2](y» dF(x, y) -
-co
.co
..
(6.7)
co
B(F, J)
=
J
J'(F[l](x»
(6.8)
f[lH(x) dF[l] (x).
-co
It is then easily seen that for the sequence of alternatives
the A.R.E. of the rank order test based on Tn is
[B(F, J)/A(F,J)
{1-p(F,J»)~H2.
Ian (l~ in (6.5),
proposti~nal
to
(6.9)
Again, if we consider the paired t-test (under the assumption of homoseedasticity,
consequent on (6,5),), it can be shm'ffi that the A." E. of this test is propostional
to
1/
2
(J'
(l-p),
(6.10)
where (J'2 is the common variance and p is the correlation coefficient of XIa and x2a .
Thus, if we define g ~ (Sl' S2) as the (population) median-vector of (1:a' xa~)'
and write
23
.. ~ ," ~"
.
~
t
(6.11)
then it is easily seen that the A. R. E. of the median test is
(6.12)
Similarly, if I1g stand for the grade correlation of
~
anel x
2a
[cf. Hoeffding (1948,
p. 318)], then the A. R. E. of the rank sum t~st will be e~ual to
co
12[
J
2
2
f[l] (x) dx] /(1- Pg )
=~
(6.13 )
(say).
-co .'
Finally, if we substituee ~-l(F) for J(E) in (6.7) and denote the corresponding
measure by P"" the 'f·-score correlation of xla and
test would be
co
[J
..
(f[l] (x)/
'If[~-l(~.[l] (x»)
dF[l]
~,
the A. R. E. of the 'f -score
(x)]2/cr~(1-~'If) = ElV (say ),
(6.14)
-co
where. cr~ stands for the variance of the cdf 'f',
If Pe F [1] == ~, the above reduces to
l/cr;(l-P)lf) and if further 'If is normal, it is easily seen that P = ~, and hence, the
.
'f
A.R.E. of the·'f·-score test will be e~ual to (6.10), the same as of the t-test. Thus
for normal alternatives, the normal score test and the paired t-test are asymptotically
power
e~uivalent
(for the sequence of alternatives (H(l»
n.
in (6.5».
Further, it
follows from (6.9) that the A.R.E. of any proposed test is the product of two factors;
the
first factor depending 'only on the margiJ,lal cdf F [1] (as A.~F,. T) is independent
of F,) while the second factor being dependent on p(F, J), which depends on both
F and J.
Howeve~,
This makes the usual univariate bounds for A.R.E. inapplicable .in this case.
if we can find bounds for the variations of (l-p(F, J»/~l-P), the same can
..
be ,+sed to study the
.'
I
l
e
I
I
I
I
I
I
I
_I
I
I
11
I
I,
I
.~.
bounds~for
bhe A.R.E.
Unfortunately, this depends on the un-
known F in a very involved way and no universal bounds may be attached.
specific case of normal cdf's, it is well-known that
24
For the
Id'
el
I
I
~
I
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
•I
1)
4( ~12 - r.
~
. -1
= -2 S~n
~
p
and pg
. -1 p/ 2,
= -6 S~n
(6.15)
~
and hence, we get that the Pitman-efficiencies of the median test and the rank-sum
test with
res~ect
to the paired t-test are respectively
(6.16)
Now, (6.16) agrees with the expressions obtained by Bickel (1965, pp 170-171) for
his median and rank-sum tests for the symemtry of a bivariane cdf (around the origin).
Consequently, the efficiency study made by him can be as it is applied in our case
to get similar bounds for the quantities in (6.16).
Finally, in the leterature, the signed-rank test by Wilcox on (1949) and tIle
normal score test by Govindarajulu (1960).are based on the values of Xla ~ X2a '
a = 1, ••• , n. If we consider the asymptotic efficiency of the type of rank order
test considered here with respect to the similar test according to the other method,
the same will be obtained
(l-p)/[l-p(F, J)],
(6.17)
and hence, nothing can be said, in general, about this.
However, for normal score
test, (fu.17) will be equal to unity, while for normal alternatives, the bounds for
(6.17) for median or rank-sum test may be easily obtained by using Bickel's (1964,
pp 170-171) results.
6.2 Scale problem To the best of knowledge of the author, there is no
nonparametric
test for the equality of Scales parameters of a bivariate distribution.
Here, we
assume that X~ and X2a have a common location
(unknown but real), and (Xla - ~)
and e(x
- ~) are interchangeable for some real and finite'~(~ 0) e; the null
2a
hypothesis being
e = 1•
In the parametric case, the test by Morgan (1939) is uased upon the significance
of the observed correlation between u
a
= X~
25
- X and Va
2a
= Kla
+ X .
2a
This test
is also related functionally to the likelihood ratio
test for the same probl.em,
and hence, possesses some optimum properties too.
~1e
proposed rank order tests for this problem are all based on statistics of
the type Tn in (3.8).
As under the null hypothesis
loss of generality that
e = 1,
we may take without any'
is the population median of either variate, and by change
~
~
of origin, we can always take
= 0. We then assume that the weight-function J(F)
corresponding to T in (3.8) satisfies the condition
n
J(F'(ex»
- J(F(x»
is '~in
is
t in
e for x 2: 0,
e' for x <
(6.18)
0, or vice verSa~
It is easily seen that under (6.18), the rank order tests based on T in (3.8) wi.ll
n
be consistent against any
all the cases.
ef
1.
The univariate scale tests satisfy (6.18) in almost
We shall consider here specifically the followmng rank order statistics.
(i)' R.an}{ mean-square statistic We let
~,a
= (a
N'H 2
- T)
for 1
:s a :s N.
n is proposed
MUltivariate extension of this is due
~o
I
I
I
I
I
II
I
II
II
II
II
J
Puriand Sen (1966).
(ii)
_I
el
In the case of two independent univariate samples, the corresponding'T
by MOod (1954) for the scale problem.
I
SymmetriC rank-statistic 1. Here, we let
EN,a=q(N+l-a)
(6.20)
for 0=1, ... , N.
. r"
1
Univariate tests of this type are dme to Sen (1963) and Chatterjee (1966).
(iii)
EN,a
~-Score
2
= ~,a
statistics With the notations in (6.3) and (6.4), we define
-1 a) 2
or ['1' . (m J ,
a
= 1,
••• , N.
. (6.21)
I
I,
I
-I'
J
As for the study of the asymptotic efficiency of these tests, we proceed
similarly as in the location problem and write
26
',-'
-':'.'
.
J
1-
I
(-~
(2)
I.
I
I
I
I
I
I
I
Ie
I
I
I
I
I
I
I
I·
I
l\; : F[2] (x) =F[l] [l+n 2e]x);
~1l=O);
(6.22)
wher: e is real and finite and F[lJ sa.tisfies the conditions of lemma 7.2 'Qf-_ (1964).
Purl..
We also define A(F, J) and ~(F, J) as in (6.6) and (6.7), and let
C(F, J)
=
l
co
x Jt(F[l](X»
f[l] (x) dF[l] (x).
(6.23)
co
-co
It is then easily seen that for the sequence of alternatives
the A. R. E.
(H~2»)
in (6.22),
of the rank order test based on T will be proportional to
n
[C(F, J)/A(F, J)l/[l-P(F, J)],
(6.24)
while the A. R. E. of the Morgan (1939) test for the same sequence of alternatives
will be proportional to
2
l/(l-p ),
(6.25)
where p is the correlation coefficient of (Xla, X2a ).
Now, if we work with the normal scores i.e., (6.21) when t is assumed to be
normal and if F[l] =~, it is easy to
case will be nothing but p2.
C(F, J)/A(F, J)
show (on using (6.7),) that p(F, J) in this
Further, if follows from (6.23) that in this case,
will be equal to unity.
Hence, fDOm (6.24) and (6.25), we will
get that for normal alternatives (scale), the normal score test and the MOrgan test
will be asymptotically power equivalent.
The asymptotic efficiencies of the other tests may be stUdied precisely on
the same line as in the location problem.
These will naturally-depend on the
correlation p(between ~a ' X2a ) and, in general, no proper bounds may be attached
to these.
7.
TESTS FOR INTERCHANGEABILITY IN BLOCKED EXPERIMENTS
Here we shall work with the model (2.5), and will test for the interchangeability of (Yla, Y ) eld::minating the effect of Za' for a= 1, ... , n.
2a
27
The procedure
will be a ve:by simple modification of what we have done before.
This may be termed
[after Hodges and Lehmann (1962)] ranking after alignment •.
We define X
a •
=
~(Xla + x2a) for ,a = 1, .•. , n and let
~ (YJ,a - Y2a ) for a
1, ••• , n;
=
(by (2.5»/
Thys
(W
a
of (Za ' a
, a
= 1,
••• , n) are n i. 1. d. r. v. vlhicllare free from the effects
= 1, .•• , n) and if really Yla, Y2a are interchangeable, then Wa
w~ll
hav:e.
a distribution symmetric about the origin. Thus, conditioned on the given set
n
{IWa I, a = 1, ••• , n}. There will be 2 possible realized values of {Wa' a = 1, .•. , n)
where each W can assume two values ± IWal, independently of each other and with
a
probability -i if H is true. He now rank the values of IW I and let Ra be the
a
o
rank of IWa I among the n values. Then,.!!. = (R , .•• , R ) will be a permutation of
n
l
(1, ••• , n). Thus, if we define E = (E l' ... , E
) (where E N is a function
"!l
n,
n,n
n, ....
of a/(n+l», then we may define a rank order statistic T by
n
T -
1. L.
\' a
n - n
(7 2)
•
nEd
= 1
n,a a'
Where d is 1 or 0 according asvl is positive or not. Once, we do this, we reduce
a
a
the problem to the univariate problem of symmetry, and all the available tests for
that may be used here.
Incidentally, here, the permutation distribution of Twill
n
be identical:with the null distribution of it, and as in our theorem 4.2, the
1..
_
asymptotic normality of n 2 (T
n
- E ) will follow readily.
n
For this, the details
are omitted, and we conclude this section by a note that the tests based on T in
n
(7.2) will be valid only for the location problem; ,~hile as a result of the reductj.on
of the bivariate
.: '. problem to a univariate problem (of symmetry), we aEe not in a position to test
for the equality of the scale parameters in this manner.
28
I
l
e
I
I
I
I
I
II
I
-I
I
I
I
I.
I
IJ
,
~
eli
I,
~
I
I.
I
'I
I
I
(I
I
I
Ie
I
I
'I
'I
II
:1
I
I·
I
8.
STUDENTIZED LOCATION.AND GENERAL SCALE PROBLEMS
First, referred to
and
(2.6), we want to test for the equality of the location
~l
~2
without assuming the seales 51 and 8 to be equal IDr to bear a knovm ratio
2
to each other. This will be then a proper nonparametric generalization of the
problem of paired t-test, where actually the two variances may be arbitarily
different from each other.
~1e
paired t-test is based npon the assumption of
normality of the bivariate cdf F in
(2.6), while our proposed tests will be vatid
for a wider class of cdf's.
We define a bivariate cdf F(x, y) to be diagonally symmetric around a location
~ = (~, ~2) if for all (x. y) in the two dimensional real space (R2 ),
F(~ - x, ~2 - y)
=1
- F(~l + x - 0, (0) - F(oo, ~2 + Y - 6)
+ F(~ + x - 0, ~2 + Y - 0),
(8.1)
and the class of all diagonally symmetric edf's will be denoted by fis ' Then, we
have the follovling lemma, whose proof is simple and left to the reader,
LEMMA 8.1
5 ::;
~l
If F(x, y)€
fi ,
s
then X - Y will have a distribution aymmetric about
- ~2'
Consequent on this, we note that if F in
then the cdf of X1CX - X2CX will be symmetric about 0, no matter whatever be
the values of 8 and 52' Thus, here also, if we assume that F € fi and we work
1
s
with the values of Wo:::; XlO: - X20:, 0:= 1, ... , n, then ire may proceed pJeecisely on
~l
::;
(2.6) is diagonally symmetric and if
~2'
the same line as in section 7 and get a class of rank order tests which are also
used to test the symmetry in the univariate case.
The efficiency study of such
tests have already been made in connection with the one sample location problem
(univariate) by Govindarajulu (1960), and hence will not be repeated here.
Let us now consider the general seale problem.
want to
t~st
identical.
Here, referred to
(2.6), we
for the equality of 51 and 52 without assuming tIle locations to be
As in the case of two independent samples, one way of approach would
be consider suitable tests based not on the original observations but on the ones
29
centered at the respective 'estimated 10cationpal'ameters [cf; Sukhatme (1958)
and Raghavachari (1965)].
~
and
~2
J..
n2
n1US, we let ~ and ~ to be some consistent estilnates of
respectively, and regarding these we shall assume that
1\
I ~i
- ~il,
i
= 1, 2 are bounded
in~probability.
(8.2)
Then, we define the centered observations as
(8.3 )
Our proposed tests will be then based upon these centered observations.
nms, we may now proceed with these centered observatchons as in section 3
and use a rank order statistic as in (3.8).
This statistic will be termed a
modified rank order statistic and will be denoted by ~.
n
1\
distribution of T will be
n
~uite
For small samples, the
involved and will fail to be
distribut~dn-free.
ThUS, as in the case of unpaired samples, we will confine ourselves only to the
large sample case.
1
n2 (T
n -
We shall study the conditions under, which
I.l
P
'1' ) ~o.
(8.4)
n
If (8.lt) holds, then asymptotically ~n will have the same properties a,s that of Tn'
and theorems 5.1 and 5.2 will also apply to it.
Consequently, the two tests based
on Tn and ~n will be asymptotically pwwer-equivalent for scale alternatives of the
type (6.22).
If ~
THEOREN 8.2
(i)
(ii)
= (~, ~)
sa.tisfies (8.3), and if in addition
F [i] (having a continuous density f [1]) is sy.rt¥etric about ~ifor i = 1, 2;
J(u) is symmetric about u
(iii)
for all It I ~ c, (c and K being constants), Where T(x) is guadratically integrable
= 1,
2; then the modified rank order Statistic based on
the centered observations and the original rank order statistic based on the true
deviations from the respective locations will be asymptotically
probability.
30
_.
I
I,
I
I
I
I
I
_I,
I·
I
I
I
I:
=! ;
J'(F[1](x-t»f[i](x-t) ~ K T(x)
with respect to F[l]' i
I
e~uivalent,
in
Ii
I
-.:
I'
I
I.
I
I
I
I
I
I,
I
II
I
I
I
I
I
I
••I
PROOF
By virtue of these assumptions it is easy to show that
co
J
J' (H(x»f [i] (x) dF [j] (x)
=0
for i, j
= 1,
(8.5)
2.
_co
Proceeding then precisely on the same line as in the proof of theorem 3.1 of Raghavachari (1965), it can be shown that all the regularity conditions for his lemma 3.3
are satisfied, and the rest of the proof can then be completed by our theorem 5.1
The details are omitted.
It is also easy to verify that the rank order statistics corresponding to
(6.19), (6.20) and (6.21) all satisfy the above conditions for a wide caass of
diagonally symmetric (bivariate) distributions.
Hence, in large samples, we may use the rank order statistic ~ for the
n
general scale problem, and the asymptotic power of these will be the same as in the
case of tests based on T in (3.8).
n
9.
TES':
'00
COMPOUND SYMMETRY
Here, referred to (2.6), we want to test Wilk's (1946) hypothesis of
comp~nd
symmetry, which reduces to
Ho·•
against the set of alternatives that at least one of them is not true.
Here, we
can work with a pair of rank order statistics, say T(l) and T(2), Where the first
n
n
one will test the equality of locations and the second one the equality of scales.
It follows from our results in section 3 that given ~ in (3.10), there will be a
n
set S(~)C ~ of 2 (conditionally) equally likely realizations of~, and (3.12)
holds for thll:s, when H in (9.1) hihlds. Thus, if we define T = (T(l), T(2»,
o
""0.
n
n
n
there will be a set
(S(~» of 2 (conditionally) equally likely values of
In
T , and hence, as in section 3, we can construct a similar size C test for the
"'--n
hypothesis (9.1)
However, in the majority of the cases, we shall prefer to use a single valued
'~1ich
statistic
will be a furlction of'T .' If we proceed as in 'section 4 or 5, we
"'Il
may deduce (under similar conditions as in there,) the jomnt asymptotic normality
of T both under the permutational as well as unconditional probability measures.
"'!l
Hence, it seems quite appropriate to use the quadratic form in
In with the
discrimi-
nant as the inverse of the covariance matrix of the same.
Now, corresponding to the rank order statistic
~i)
(as in (3.4»
T~i),
we define the sequence
by
(i) - (E..(i)
E..( i) )
EN
-~, l' .• , -N,N'
.
~
= 1, 2;
(9.2)
where
a
(N+:i)' 1
(1) _
EN,a - I N(1)
S a ~ N,
i
= 1,
2.
The function IN(i) will be assumed to satisfy the r€gularity conditions of section
4.
In addition to this, we say that
N
~ (E\l) _ R~l»
L
a=l
where
~,a
~i),
i
~
= 1,
~l)
and
~2)
are orthogonal to each other, if
(E..(2) ~ E(2» = 0,
-N,a
(9. 4)
N
2 are defined as in (4.15).
The orthogonality of
is not essential for our purpose, but we shall see later on tllat it
the statistic appreciably and it holds in the majority of the cases.
~1)
and
~2)
simpli~ies
In fact,
the location and seale tests considered in sectmon 6 all satisfy this condition
(under suitable choice of 'pairs of T(l), T(2».
.
n
n
If (9.4) holds, and we define the permutation variance of T(i) by cr..2 ./N,
n
~,:L
i = 1, 2 (as in (4.20», then it is easy to show that under the permutational
probability measure P ,
n
I
--I
I
I
I
I
I
I
-I
I
I
I
I:
I,
I:
I:
2
S = N \"' (T (i) _ R~ i) ) 2/ CT 2 .
n
n
~
N,~
.L
i=l
2
has asymptotically, in probability, a x distribution with 2 d.f., and as a suitable
32
-.
I'
I
If (9.4) does not hold, "Ire have to define
test statistic, we may propose Sn'
N COV(T(i) T(j)
n 'n
I.
I
I
=
I
I
I
I
I
Ie
I
I
I
I
J<
I
I
-,!
I
I· -J \.l
I
I r n)
= 0:
"
N,lJ
;
i, j = 1, 2'
'
(9.6)
and
S
n
N
It can be shovm by a straight forward vector extension of theorem 4.2 that
have asymptotically, in probability,
2 d,f.
(under
Hence, in any case, we may use S in
n
r n ),
(9.5)
SN
'nIl
a chi square distribution with
or
(9.7)
as a suitable test-
statmttic, and carry oUD the permutation test based on it both for small and large
samples.
Further, we define
-co
2
• • (F)" (j
n,lJ
n (F) in (5.3) "I'lith '"
(j
+
=~
-co
= J('l')
{JJF[2](X)[1-F[2](y)J{ "'(1) (H(x»
<
X
<
YJ
<
J(l)(H(Y»
X
co
-c:>
co
co
-co
-co
<
Y
<
co
J(2)(H(Y) +
co
J(2)(H(X» }dF[l](X) dF[2](Y)
!F[l](X)[l- F[l](Y)] {J(l)(H(X»
-co
if i=j;
J(2)(H(Y»
+
'-
co
fiF(X,y) - F[l](x) F[2](Y)] J(l)(H(X»
J(2)(H(y» dF[l](x) dF[2](Y)
-co
[F(x,y) -
F[l]~x)
F[2](Y)] J(l)(H(Y»
33
J(2)(H(X»
dF[l](X) dF[2](Y)}'
~len,
bya more or.less straight forward vector generalization of theorem 5.1,
we will arrive at the (unconditional) joint asymptotic normality of
with a null mean vector and the dispersion natrig E(F) = «rr
.. (F»). Theorem 5.2
,."
n, ~J
also extends directly to this case; and it can be shown easily that
p
~,." ~(F) under (5.30).
Thus, in this case also, if we' have any consistent estionator ~(F) of ~(F)J and
we propose a large sample unconditional test based on tIle quadratic form in
In - ~
with discriminant as the inverse of ~(F)J then the permuation test based:on ?n. in
(9.7) will be stochastically equivalent to this test for sequences of alternatives
in (5.30).
Let us now consider two specific rank-order tests for compound symmetry.
(I)
Rank order test, I.
. ~l) = ex,
Here, we let
1
,ex
~
ex
~
(9.12)
N.
~2) = (ex _ (N+l)/2)2 for 1 < ex ...< N.
,ex
-
(9.13)
Thus, T(l) will be the rank-surn test (as in (6.2),)
n .
~or
location, and T(2) will
n
be the ramc mean square test for Scale (as in (6.19». It is easily seen that
(9. 4 ) holds for (9.12) and (9.13).
So our statistic S reduces to
n
I
.-I
I
I
I
I
I
I
_I,
11
I
I
I,
I,
',i,'
"~
N
2
2
erN, 1
where (T,N •
,~
2
O:N, 2
is the permutation variance of ..rN T(i), i = 1, 2, vTl1ich miiy be evaluated
n
as in (4.18).
(II) 'f-score .test H~e~ vre define,~, i as in (6.3), and let
~~~ = ~,ex ,E~~~
=
(~,ex - ~)2,
ex
34
= 1, ... , N,
I'
Ii
-.
I-
I
I.
I
I
I
I
I
I
I
Ie
I
I
...
'iii
where
~
1
=: -
N
l: N
~(l) = t- l
,et
Ci
:;
aN, (1: Alternati vely, we may "take,
0:=1
(N~l)'
1, ••• , N.
Again, if'l' is a symmetric cdf, i t is easily seen that (9.4) holds. (In particular, if y. is normal, then also (9.4) hihlds.)
statistic S in
,
n
(N=1)/2 and
(9.7) reduces to a form similar to
I~ -1)/12,
Hence, for symmetric 'It',
(9~~),
where
the
instead of
we will have to use the respective means of E(l) and &..(2)
~,et
N,et
For normal t, the above statistic will be termed the normal
in (9.15) or (9.16).
score statistic and the test as the normal score test for compound symmetry.
As for the study of the
as~~totic
power properties of these tests, we may
use the extension of Theorem 5.2 considered earlier, and with that for the sequence
of altermatives:
H(3) •
n
(wi tIl
•
e1 , e2
real and finite and F [1] satisfying the conditions of lemma 7:2 of Puri
(1964),) it can be easily shovm that for normal alternatives, the normal score test
and WiD;:s t
Lw test
are asymptotically power-equivalent.
The efficiency of the
normal score test against L test for non-normal alternatives will depend on the
MV
two unknown correlations ~(F, J(l»
and J 2 = [1¥-l(F [1] (x»
;
a· _ !-l(~) ]2,
E(2) = ['1' -1 (rn)
N,Ci
for the same.
i,)
and p(F, J(2»
in (6.7), (Where J(l) ='l'~l(F[l](x)'
and hence nothing can be said, in general, about the bounds
Finally, the efficiency of the other rank test
w~ll
two sucll unknown correlations, and hence, no proper bounds can be
same.
also depend on
a~tached
to the
However, it can be shown that there are many situations Where the proposed
rank order tests are more efficient (asymptotically) than the Wilks' L test.
MV
an example, we may consider the case, where
-e
where both F[l] and F[2J are cauchy cdf's with appropriate cdf's.
In this extreme
case, the efficiency of the normal score test with respect to L test will be
MV
3~
As
:'ll'-l:;.:ttJ.,':'.h~~· .Le..r.clt;;; . ~r.
'the")t~1er
·:me
blem.
Wi:J.J.
incQn~is..tE:J:lt :'.l·
iJ€;
'ibis
~as~) ~tii ..... p.
On some asymptotically nonparametric corupetitors of Hotelling's
Ann. Math. Stat.
CHATTERJEE,
t~fi.L·
will be reasonably efficient.
REFERENCES
BICKEL, F. (1965):
rI.
m.
fe,\.-;..; .. L
s.
K. (1966):
~
l60-l'"{3.
A nonparametric test for the several sample scale pr':-
To be published.
CHATTERJEE, S. K., and SEN, P. K. (1964):
twlh sample location problem.
Nonparametric tests for the bivariate
Calcutta. Stat. Asso.
CHERNOFF, H., and SAVAGE, I. R. (1958):
B~.
ll,
18-58.
Asymptotic normality and efficiency
certain nonparametric test statistics.
~f
Ann. Math. Stat. ~ 972-994.
GOVINDARAJULU, Z. (1960): Central limit theorems ana asymptotic efficiency for ~n~
sample nonparametric procedures.
Tech. Rep. No. 11. Dept. of Stat. Univ. of
Minnesota.
GOVINDARAJULU, Z., LECAM, L., and RAGHAVACHARI, M. (1965):
Generalizations of
Chernoff-Savage theorems on asymptotic normality of nonparametric test statistics.
Froe. Fifth Berkeley Sysp. Math. Stat. Froc. (in press).
HODGES, J. L. Jr, and LEIINAtm, E. L. (1962):
HOEFFDING, W. (1948):
bution.
ltLOTZ., J (1962):
Ann. Math. Stat. ~ 482-491
On a class of statistics with asymptotically normal distri-
Ann. Math. Stat.
12t
293-325.
Nonparametric tests for scale.
LEHMANN, E. L., and STEIN,
c.
••I
I
I
I
I
I
I
-1'1
I1
Ii
II
t
Rank methods for conbination of in-
dependent experiments in analysis of variance.
I
Ann. Math. Stat. ~, 498-512.
(1949): On the theory of some nonparametric hypotheses.
1
,
II
I
I~
Ann. Math. Stat.
LOEVE, M. tJ.963) Probabi-lity The:Jr¥. D. Van Nostrand Co. Princeton, N. J.
MANN, H. B., and WALD, A. (1942):
en the choice of the number of class intervals
II
I;;
,
in tIle application of the cili square test.
MOOD) A. M. (1950):
Ann. Math. stat.
~
306-317.
Introduction to the lfheory of Statistics. Me Graw Hill. New
Yo:r;-k.
;6
-.
I,
I
I.
I
I
I
I
I
I
I
Ie
I
MOOD, A. }L (1954):
On the asymptotic efficiency of certain nonparametric tests.
gz,
Ann. Math. Stat.
MORGAN, W. A. (1939):
514"522.
A test for the significance of the difference between the
two variances in a sample from a normal bivariate population.
2.b
Biometrika
13-19.
PURl, M. L (1964): Asynq>totic efficiency of a class of c-samp1e tests.
~. ~,
102-12l.
PURl, M. L., and SEN, P. K. (1966):
order tests.
Math. Stat.
On
a class of multivariate multisamp1e rank
Sankhya (Submitted).
RAGHAVACHARI, M. (1965) :
~
SEN, P. K. (1963):
~.
Ann. Math.
Two sample scale problem when locations are unknown.
Ann.
1236-1242.
On weighted rank sum tests for dispersion.
Ann. Inst. Stat.
1,2., 17-135.
SUKHATME, B. V. (1958):
location.
Testing the hypothesis that two pmpulations differ only in
Ann. Math. Stat. ~ 60-67.
WILCOXON, F. (1949):
Some rapod approximate statistical procedures.
American
Cynamid Co. Stanford, Conn.
~,
S. S. (1946):
Sample criteria for testing equality of means, equality of
variances, and equality of covariances in a normal multivariate distribution.
Ann. Math. Stat.
11" 257-281.
I
I
I
I
~e
"!!!II
37