THE POWER OF THE LIKELIHOOD RATIO TEST
OF LOCATION IN NONLINEAR REGRESSION MODELS

by

A. R. Gallant

Institute of Statistics
Mimeograph Series No. 889
Raleigh, September 1973
The Power of the Likelihood Ratio Test of Location

by

A. R. Gallant*

ABSTRACT

The Likelihood Ratio test statistic T for the hypothesis H: θ = θ₀ against A: θ ≠ θ₀ is considered when the data are generated according to the nonlinear model

    y = f(x, θ) + e

with variance unknown. A random variable X is obtained such that n(T − X) converges in probability to zero; the distribution function of X is derived assuming normal errors. The power of the Likelihood Ratio test is tabulated for selected sample sizes and selected departures from the null hypothesis by using the distribution function of X to approximate the distribution function of T. Monte-Carlo power estimates for an exponential model are compared to power points calculated using this approximation to gain a feel for the adequacy of the approximation.

* Assistant Professor of Economics and Statistics, North Carolina State University.
1. INTRODUCTION

This paper considers the power of the Likelihood Ratio test of the hypothesis H: θ = θ₀ against the alternative A: θ ≠ θ₀ at the α level of significance when the data are responses y_t generated according to the nonlinear regression model

    y_t = f(x_t, θ) + e_t    (t = 1, 2, ..., n).

The unknown parameter θ is known to be contained in Ω, which is a subset of the p-dimensional reals. The inputs x_t are chosen from X, which is a subset of the k-dimensional reals. The errors e_t are independently and normally distributed with mean zero and unknown variance σ².

The Likelihood Ratio test and the large sample distribution of its test statistic are obtained in Section 3. The power is tabulated at selected departures from the null hypothesis and selected sample sizes in Section 4. Monte-Carlo estimates of power are compared with the large sample values for an exponential model in Section 5 in order to gain a feel for the adequacy of the large sample approximation. Section 6 contains summary and concluding remarks.

The results presented in this paper are for the case of unknown variance; for the case of known variance, the large sample distribution of the test statistic is obtained, but not tabulated, in [2, 3].
2. NOTATION AND ASSUMPTIONS

The following notation will be useful.

Notation. Given the regression model

    y_t = f(x_t, θ) + e_t    (t = 1, 2, ..., n)

where θ ∈ Ω ⊂ Rᵖ, the observations y_t (t = 1, 2, ..., n), and the hypothesis of location H: θ = θ₀ against A: θ ≠ θ₀, we define:

    y = (y₁, y₂, ..., y_n)′    (n × 1),
    f(θ) = (f(x₁, θ), f(x₂, θ), ..., f(x_n, θ))′    (n × 1),
    e = (e₁, e₂, ..., e_n)′    (n × 1),
    ∇f(x, θ) = the p × 1 vector whose j-th element is (∂/∂θⱼ) f(x, θ),
    F(θ) = the n × p matrix whose t-th row is ∇′f(x_t, θ),
    P = F(θ)[F′(θ)F(θ)]⁻¹F′(θ)    (n × n),
    P⊥ = I − P    (n × n),
    δ = f(θ) − f(θ₀)    (n × 1),
    g(t; ν, λ) = the non-central chi-squared density function with ν degrees freedom and non-centrality λ [4, p. 74],
    G(x; ν, λ) = ∫₀ˣ g(t; ν, λ) dt,
    n(x; μ, σ²) = the normal density function with mean μ and variance σ²,
    N(x; μ, σ²) = ∫₋∞ˣ n(t; μ, σ²) dt,
    p(i; λ) = the Poisson density function with mean λ.
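For readers who wish to experiment with these definitions, the sketch below builds F(θ) and P numerically for a hypothetical two-parameter response f(x, θ) = θ₁ exp(θ₂x) (chosen here only for illustration; it is not prescribed by this section) and verifies that P is idempotent with rank p and that P⊥ annihilates the columns of F(θ):

```python
import numpy as np

# Hypothetical response function, used only to illustrate the notation:
# f(x, theta) = theta_1 * exp(theta_2 * x), so p = 2.
def grad_f(x, theta):
    # p x 1 gradient: (d/d theta_1, d/d theta_2) f(x, theta)
    return np.array([np.exp(theta[1] * x), theta[0] * x * np.exp(theta[1] * x)])

theta = np.array([0.5, 0.5])
x = np.linspace(0.0, 1.0, 30)                    # n = 30 inputs in X = [0, 1]
F = np.array([grad_f(xt, theta) for xt in x])    # n x p matrix F(theta)

P = F @ np.linalg.inv(F.T @ F) @ F.T             # projection onto col space of F
P_perp = np.eye(len(x)) - P

assert np.allclose(P @ P, P)          # P is idempotent
assert round(np.trace(P)) == 2        # rank p = 2
assert np.allclose(P_perp @ F, 0.0)   # P-perp annihilates F(theta)
print("P idempotent with rank", round(np.trace(P)))
```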
In order to obtain asymptotic results, it is necessary to specify the behavior of the inputs x_t as n becomes large. A general way of specifying the limiting behavior of nonlinear regression inputs is due to Malinvaud [6]. Malinvaud's definitions are repeated below for the reader's convenience; a complete discussion and examples are contained in his paper.

Definition. Let 𝒜 be the Borel subsets of X and {x_t}_{t=1}^∞ be a sequence of inputs chosen from X. Let I_A(x) be the indicator function of a subset A of X. The measure μ_n on (X, 𝒜) is defined by

    μ_n(A) = n⁻¹ Σ_{t=1}^n I_A(x_t)    for each A ∈ 𝒜.

Definition. A sequence of measures {μ_n} on (X, 𝒜) converges weakly to a measure μ on (X, 𝒜) if for every real valued, bounded, and continuous function g with domain X,

    ∫ g dμ_n → ∫ g dμ    as n → ∞.
The assumptions below are used to obtain the large sample distribution of the Likelihood Ratio test statistic.

Assumptions. The parameter space Ω and the input space X are compact subsets of the p-dimensional and k-dimensional reals, respectively. The response function f(x, θ) and the partial derivatives (∂/∂θⱼ) f(x, θ) and (∂²/∂θᵢ∂θⱼ) f(x, θ) are continuous on X × Ω.

The sequence of inputs {x_t}_{t=1}^∞ is chosen such that the sequence of measures {μ_n}_{n=1}^∞ converges weakly to a measure μ defined over (X, 𝒜).

The true value of θ, denoted by θ⁰, is contained in an open set which, in turn, is contained in Ω. If f(x, θ) = f(x, θ⁰) except on a set of μ measure zero, it is assumed that θ = θ⁰. The matrix

    J = [ ∫ (∂/∂θᵢ) f(x, θ⁰) (∂/∂θⱼ) f(x, θ⁰) dμ(x) ]

is non-singular.

As mentioned earlier, the errors {e_t} are independently distributed with density n(x; 0, σ²), where σ² is non-zero and finite.

These assumptions are patterned after those used by Malinvaud [6] to show that the Maximum Likelihood (least squares) estimator is consistent.
In addition, it can be shown [2, 3] under these assumptions that a measurable least squares estimator θ̂(y) minimizing

    n⁻¹ (y − f(θ))′(y − f(θ))

over Ω exists, and that √n (θ̂(y) − θ⁰) is asymptotically normally distributed with mean zero and variance-covariance matrix σ² J⁻¹.
The following theorem is proved in [2, 3].

Theorem 1. Under the assumptions listed above, the estimator σ̂²(y) is consistent for σ² and is characterized by

    σ̂²(y) = e′P⊥e/n + a_n

where n·a_n converges in probability to zero. (The matrix P is evaluated at θ⁰, the true parameter value. There is an N such that F′(θ⁰)F(θ⁰) is non-singular for all n > N.)

Assumptions which allow Ω to be an unbounded set and do not require that the second partial derivatives of f(x, θ) exist, yet are sufficient for the conclusion of Theorem 1, are given in [2].
3. LARGE SAMPLE DISTRIBUTION OF THE LIKELIHOOD RATIO TEST STATISTIC

The Likelihood of the sample y is

    L(y; θ, σ²) = (2πσ²)^(−n/2) exp{−½ σ⁻² (y − f(θ))′(y − f(θ))}.

The Maximum Likelihood estimators under H are θ̃ = θ₀ and

    σ̃²(y) = n⁻¹ (y − f(θ₀))′(y − f(θ₀));

over the entire parameter space {(θ, σ²): θ ∈ Ω, 0 < σ² < ∞} they are θ̂ = θ̂(y), which minimizes (y − f(θ))′(y − f(θ)) over Ω, and

    σ̂²(y) = n⁻¹ (y − f(θ̂(y)))′(y − f(θ̂(y))) = inf_Ω n⁻¹ (y − f(θ))′(y − f(θ)).

The Likelihood Ratio test, therefore, has the form: reject H when

    T(y) = σ̃²(y)/σ̂²(y)

is larger than c, where c is chosen so that P[T(y) > c | θ = θ₀] = α.
The following lemma is needed to prove the main result of this section.

Lemma 1. Under the Assumptions listed in Section 2,

    1/σ̂²(y) = n/(e′P⊥e) + b_n

where n·b_n converges in probability to zero.

Proof. By Theorem 1, σ̂²(y) = e′P⊥e/n + a_n with n·a_n converging in probability to zero; moreover, σ̂²(y) and e′P⊥e/n both converge in probability to σ² > 0. Then

    b_n = 1/σ̂²(y) − n/(e′P⊥e) = −a_n / [σ̂²(y)(e′P⊥e/n)],

so that n·b_n = −n·a_n / [σ̂²(y)(e′P⊥e/n)]. The numerator converges in probability to zero and the denominator converges in probability to σ⁴ > 0, whence n·b_n converges in probability to zero.

Theorem 2. Under the Assumptions listed in Section 2 the Likelihood Ratio test statistic may be characterized by

    T(y) = X + c_n

where n·c_n converges in probability to zero and the distribution function of X is given, for x > 1, by

    P[X > x] = ∫₀^∞ G(t/(x−1) + 2xλ₂/(x−1)²; n−p, λ₂/(x−1)²) g(t; p, λ₁) dt

with λ₁ = δ′Pδ/(2σ²) and λ₂ = δ′P⊥δ/(2σ²); when λ₂ = 0 the distribution function is zero for x ≤ 1.
Proof. By the preceding Lemma,

    T(y) = (y − f(θ₀))′(y − f(θ₀))/(e′P⊥e) + b_n (y − f(θ₀))′(y − f(θ₀))/n
         = (e + δ)′(e + δ)/(e′P⊥e) + b_n (e + δ)′(e + δ)/n
         = X + c_n

where

    n·c_n = n·b_n (e′e/n + 2δ′e/n + δ′δ/n).

(The term δ is evaluated at θ⁰, and n·b_n converges in probability to zero by Lemma 1.) The term e′e/n converges to σ² by the Strong Law of Large Numbers. Since {f(x, θ⁰) − f(x, θ₀)}² is a continuous function of x, we have

    δ′δ/n = n⁻¹ Σ_{t=1}^n {f(x_t, θ⁰) − f(x_t, θ₀)}² → ∫ {f(x, θ⁰) − f(x, θ₀)}² dμ(x)

as n → ∞ by the weak convergence of the measures μ_n. Thus Var(2δ′e/n) → 0, and 2δ′e/n converges in probability to zero by Chebysheff's inequality. The last term δ′δ/n converges to a finite constant as shown above. Thus n·c_n converges in probability to zero.

Set z = σ⁻¹e and γ = σ⁻¹δ. The random variables z₁, z₂, ..., z_n are independent with density n(t; 0, 1). For an arbitrary constant a, (z + aγ)′P(z + aγ) is a non-central chi-squared random variable with p degrees freedom and non-centrality a²γ′Pγ/2, since P is idempotent with rank p. Similarly, (z + bγ)′P⊥(z + bγ) is a non-central chi-squared with n − p degrees freedom and non-centrality b²γ′P⊥γ/2. These two random variables are independent because P P⊥ = 0; see Graybill [4, p. 74ff].

Let a > 0. Then

    P[X > a + 1] = P[(z + γ)′(z + γ) > (a + 1) z′P⊥z]
                 = P[(z + γ)′P(z + γ) > a z′P⊥z − 2γ′P⊥z − γ′P⊥γ]
                 = P[(z + γ)′P(z + γ) > a(z − a⁻¹γ)′P⊥(z − a⁻¹γ) − (1 + a⁻¹)γ′P⊥γ]
                 = ∫₀^∞ P[(z − a⁻¹γ)′P⊥(z − a⁻¹γ) < {t + (1 + a⁻¹)γ′P⊥γ}/a] g(t; p, γ′Pγ/2) dt
                 = ∫₀^∞ G({t + (1 + a⁻¹)γ′P⊥γ}/a; n − p, a⁻²γ′P⊥γ/2) g(t; p, γ′Pγ/2) dt.

By substituting x = a + 1, λ₁ = γ′Pγ/2, and λ₂ = γ′P⊥γ/2, one obtains the form of the distribution function for x > 1. The derivations for the remaining cases are analogous.
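The λ₂ = 0 case of Theorem 2 can be checked by simulation: when P⊥γ = 0, X = 1 + (z + γ)′P(z + γ)/(z′P⊥z), so (n − p)(X − 1)/p is exactly a non-central F with p and n − p degrees freedom. The sketch below is an illustrative check, not part of the paper; note that scipy writes the non-centrality as γ′Pγ rather than γ′Pγ/2.

```python
import numpy as np
from scipy.stats import ncf

rng = np.random.default_rng(0)
n, p = 30, 3
# Build an arbitrary rank-p projection P and a gamma lying in its range,
# so that P-perp gamma = 0 (the lambda_2 = 0 case of Theorem 2).
A = rng.standard_normal((n, p))
P = A @ np.linalg.inv(A.T @ A) @ A.T
gamma = P @ rng.standard_normal(n)
gamma *= np.sqrt(4.0 / (gamma @ P @ gamma))   # scale so gamma'P gamma = 4

reps = 20000
z = rng.standard_normal((reps, n))
num = ((z + gamma) * ((z + gamma) @ P)).sum(axis=1)    # (z+gamma)'P(z+gamma)
den = (z * (z @ (np.eye(n) - P))).sum(axis=1)          # z'P-perp z
X = 1.0 + num / den

assert X.min() > 1.0                        # X >= 1 when lambda_2 = 0
x0 = 1.33                                   # an arbitrary point x > 1
emp = (X > x0).mean()
thy = ncf.sf((n - p) * (x0 - 1.0) / p, p, n - p, 4.0)  # nc = gamma'P gamma
assert abs(emp - thy) < 0.02                # agreement within Monte-Carlo error
```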
The large sample approximation of the critical point c will be denoted c* and is defined by

    P[X > c* | δ = 0] = α

where X is as in Theorem 2. The point c* can be obtained from a table of F as follows. When δ = 0,

    P[X > c*] = P[e′Pe/(e′P⊥e) > c* − 1] = P[F > (n − p)(c* − 1)/p].

Let F_α denote the α·100 percentage point of an F random variable with p numerator degrees freedom and n − p denominator degrees freedom; then c* is given by

    c* = 1 + p F_α/(n − p).

Note that if 0 < α < 1 then c* > 1. It is assumed throughout the rest of the paper that 0 < α < 1.
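The tabled critical points can be reproduced with any modern F quantile routine; a sketch using scipy (a convenience assumed here, not anything used in the paper):

```python
from scipy.stats import f

def c_star(p, n, alpha=0.05):
    # c* = 1 + p F_alpha / (n - p), with F_alpha the upper alpha
    # percentage point of an F(p, n - p) random variable
    return 1.0 + p * f.ppf(1.0 - alpha, p, n - p) / (n - p)

# Agrees with the table captions below to about three decimals,
# e.g. c* = 1.52060 for p = 5, n = 30 and c* = 1.14532 for p = 3, n = 60.
print(round(c_star(5, 30), 5))
print(round(c_star(3, 60), 5))
```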
4. PARTIAL TABULATION OF THE POWER FUNCTION

The conclusion of Theorem 2 states that n(T − X) converges in probability to zero as n tends to infinity. This is a rapid approach of the difference T − X to zero, which leads one to expect that the probability P[X > t] would be a reasonable approximation of P[T > t] even in small samples.

The probability P[X > c*], with c* chosen for α = .05, is tabulated in Tables 1 through 9 for values of p, n, λ₁, and λ₂ thought to be representative of those occurring most frequently in applications.

The details of the numerical evaluation of P[X > c*] are as follows. The density g(t; ν, λ) may be put in the form

    g(t; ν, λ) = Σ_{i=0}^∞ p(i; λ) g(t; ν + 2i, 0)

[4, p. 76].
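This expansion is easy to verify numerically. Note that scipy's ncx2 takes the sum of squared means as its non-centrality, which is 2λ in the convention of [4] used here (an illustrative check, not part of the paper):

```python
import numpy as np
from scipy.stats import ncx2, chi2, poisson

nu, lam = 5, 2.0         # degrees freedom and non-centrality ([4]'s convention)
t = np.linspace(0.1, 20.0, 50)

# Truncated Poisson mixture of central chi-squared densities:
# g(t; nu, lam) = sum_i p(i; lam) g(t; nu + 2i, 0)
mix = sum(poisson.pmf(i, lam) * chi2.pdf(t, nu + 2 * i) for i in range(60))

# scipy's ncx2 uses non-centrality 2*lam (sum of squared means)
assert np.allclose(mix, ncx2.pdf(t, nu, 2 * lam), atol=1e-10)
```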
1. Power. [Caption and tabulated values illegible in this copy.]
2. Power: p = 3, n = 30, c* = 1.32893

    λ₂ \ λ₁    0     .5     1      2      3      4      5      6      8     10     12
    0.0      .050   .106   .171   .314   .456   .585   .692   .778   .892   .951   .979
    .0001    .050   .106   .171   .314   .456   .585   .692   .778   .892   .951   .979
    .001     .050   .106   .171   .314   .457   .585   .692   .778   .892   .951   .979
    .01      .051   .107   .173   .316   .458   .586   .693   .779   .892   .951   .979
    .1       .058   .118   .185   .330   .472   .596   .703   .786   .896   .953   .980
3. Power: p = 5, n = 30, c* = 1.52060

    λ₂ \ λ₁    0     .5     1      2      3      4      5      6      8     10     12
    0.0      .050   .089   .134   .239   .353   .465   .569   .661   .802   .892   .944
    .0001    .050   .089   .134   .239   .353   .465   .569   .661   .802   .892   .944
    .001     .050   .089   .134   .239   .353   .465   .569   .661   .802   .892   .944
    .01      .051   .089   .135   .240   .354   .466   .570   .662   .803   .892   .944
    .1       .056   .097   .144   .251   .365   .477   .580   .670   .808   .895   .946
4. Power: p = 2, n = 60, c* = 1.10897

    λ₂ \ λ₁    0     .5     1      2      3      4      5      6      8     10     12
    0.0      .050   .129   .218   .399   .563   .695   .795   .866   .947   .980   .993
    .0001    .050   .129   .218   .399   .563   .695   .795   .866   .947   .980   .993
    .001     .050   .129   .218   .399   .563   .695   .795   .866   .947   .980   .993
    .01      .051   .130   .219   .401   .564   .696   .796   .866   .947   .980   .993
    .1       .060   .144   .235   .417   .578   .707   .803   .871   .948   .981   .993
5. Power: p = 3, n = 60, c* = 1.14532

    λ₂ \ λ₁    0     .5     1      2      3      4      5      6      8     10     12
    0.0      .050   .111   .182   .337   .488   .622   .730   .813   .916   .966   .987
    .0001    .050   .111   .182   .337   .488   .622   .730   .813   .916   .966   .987
    .001     .050   .111   .182   .337   .489   .622   .730   .813   .916   .966   .987
    .01      .051   .112   .183   .338   .490   .623   .731   .813   .917   .966   .987
    .1       .059   .124   .198   .355   .505   .636   .740   .820   .920   .967   .987
6. Power: p = 5, n = 60, c* = 1.21664. [Tabulated values illegible in this copy.]
7. Power: p = 2, n = 120, c* = 1.05337. [Tabulated values illegible in this copy.]
8. Power: p = 3, n = 120, c* = 1.06762. [Tabulated values illegible in this copy.]
9. Power: p = 5, n = 120, c* = 1.09925. [Tabulated values illegible in this copy.]
Using this expression and rearranging terms,

    P[X > c*] = Σ_{i=0}^∞ p(i; λ₂/(c* − 1)²) ∫₀^∞ G(t/(c* − 1) + 2c*λ₂/(c* − 1)²; n − p + 2i, 0) g(t; p, λ₁) dt.

This expression was evaluated on an IBM 370/165 using the IBM Scientific Subroutine Package [5] subroutines DLGAM, CDTR, and DQL16. A listing of the author's program is available on request.
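With modern libraries the same evaluation can be sketched in a few lines (an illustration, not the author's program; scipy's non-centrality parameter is again 2λ in the convention used here). The λ₂ = 0 case is checked against the non-central F identity noted in Section 6:

```python
import numpy as np
from scipy import integrate
from scipy.stats import ncx2, chi2, ncf, f

def G(x, df, lam):
    # Non-central chi-squared cdf; scipy's non-centrality is 2*lam in [4]'s convention
    return chi2.cdf(x, df) if lam == 0 else ncx2.cdf(x, df, 2 * lam)

def g(t, df, lam):
    return chi2.pdf(t, df) if lam == 0 else ncx2.pdf(t, df, 2 * lam)

def power(p, n, lam1, lam2, alpha=0.05):
    """P[X > c*] with c* = 1 + p F_alpha/(n-p), by adaptive quadrature."""
    c = 1.0 + p * f.ppf(1.0 - alpha, p, n - p) / (n - p)
    a = c - 1.0
    integrand = lambda t: G(t / a + 2 * c * lam2 / a**2, n - p, lam2 / a**2) * g(t, p, lam1)
    val, _ = integrate.quad(integrand, 0.0, np.inf)
    return val

# lambda_2 = 0 reduces to the non-central F power noted in Section 6
p, n, lam1 = 3, 60, 2.0
direct = power(p, n, lam1, 0.0)
via_F = ncf.sf(f.ppf(0.95, p, n - p), p, n - p, 2 * lam1)
assert abs(direct - via_F) < 1e-6
```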
5. A MONTE-CARLO COMPARISON

In an effort to gain a feel for the adequacy of the large sample approximation in samples of moderate size, Monte-Carlo power estimates were obtained for an exponential model with p = 2 parameters. Thirty inputs were chosen from X = [0, 1]: the points 0(.1).7 three times and the points .8(.1)1 twice. The parameter space was Ω = [0, 1] × [0, 1] and the null hypothesis was H: θ = (1/2, 1/2). At the null hypothesis and at selected departures from the null hypothesis, five thousand random samples were generated according to the model. The point estimate p̂ of P[T > c*] was computed as the proportion of the five thousand samples for which T exceeded c*; its variance was estimated by Var(p̂) ≐ p(1 − p)/5000 with p = P[X > c*]. The results are presented in Table 10.
Certain points should be mentioned about the choice of the true values θ⁰ ≠ (1/2, 1/2) used in the Monte-Carlo study. Based on a numerical evaluation of λ₁ and λ₂ over Ω, the ratio λ₂/λ₁ is minimized for θ⁰ of the form (θ₁, 1/2) and maximized for θ⁰ of the form (1/2, 1/2) + r(cos ω, sin ω) in the direction ω attaining the maximum. Three points were chosen to be of the first form, and two of the latter form. Further, two sets of points were paired with respect to λ₁. This was done to evaluate the variation in power as λ₂ changes while λ₁ is held fixed.
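Because the model specification is illegible in this copy, the experiment can only be approximated. The sketch below uses a stand-in exponential response f(x, θ) = θ₁ exp(θ₂x) (an assumption; the paper states only that the model is exponential with p = 2) together with the design described above, and estimates the size of the test at the null hypothesis:

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.stats import f

rng = np.random.default_rng(1)
x = np.array([i / 10 for i in range(8)] * 3 + [0.8, 0.9, 1.0] * 2)  # 30 inputs
n, p = len(x), 2
theta_null = np.array([0.5, 0.5])
c_star = 1.0 + p * f.ppf(0.95, p, n - p) / (n - p)

def response(theta):
    # Stand-in exponential response; the paper's exact f is illegible here
    return theta[0] * np.exp(theta[1] * x)

def lr_stat(y):
    sse0 = np.sum((y - response(theta_null)) ** 2)           # n * sigma-tilde^2
    fit = least_squares(lambda th: y - response(th), theta_null,
                        bounds=([0.0, 0.0], [1.0, 1.0]))     # Omega = [0,1] x [0,1]
    sse1 = np.sum(fit.fun ** 2)                              # n * sigma-hat^2
    return sse0 / sse1                                       # T(y)

reps, sigma = 500, 0.1
rejections = sum(lr_stat(response(theta_null) + sigma * rng.standard_normal(n)) > c_star
                 for _ in range(reps))
print(rejections / reps)   # rejection rate at the null; should be near alpha = .05
```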
6. REMARKS

Considering the standard errors of the Monte-Carlo estimates of P[T > c*], the estimates support the use of P[X > c*] to approximate power in this instance. Generalizations beyond this instance are subject to the usual risks of generalizing from Monte-Carlo studies.

In most applications, λ₂ will be quite small relative to λ₁, as in the Monte-Carlo study. This being the case, a value of P[X > c*] computed with λ₂ = 0 would be adequate. Note that when λ₂ = 0,

    P[X > c*] = P[F′ > F_α]

where F′ denotes a non-central F with p numerator degrees freedom, n − p denominator degrees freedom, and non-centrality λ₁ [4, pp. 77-78]. In other words, the first row of Tables 1 through 9 is a tabulation of the power of the F-test.

10. Monte-Carlo Power Estimates

    Parameters            Non-centralities
    θ₁          θ₂        λ₁        λ₂          P[X > c*]    p̂
    .5          .5        0         0           .050         .0532
    .5398       .5        .9854     0           .204         .2058
    .4237       .6949     .9853     .00034      .204         .2114
    .5856       .5        4.556     0           .727         .7140
    .3473       .8697     4.556     .00537      .728         .7312
    [illegible] .5        8.958     0           .957         .9530
Thus, in most applications, an adequate indication of the power of the Likelihood Ratio test can be obtained from charts of the power of the F-test such as [1] and [7].

One last point might be mentioned. To reject H when T(y) exceeds c* = 1 + p F_α/(n − p) is equivalent to rejecting H when S(y) exceeds F_α, where

    S(y) = {[σ̃²(y) − σ̂²(y)]/p} / {σ̂²(y)/(n − p)}.

This form of the Likelihood Ratio test is analogous to the F-test in linear regression and can be compared directly with tabled F-test critical points. As stated above, the behavior of S(y) under the alternative A: θ ≠ θ₀ will be adequately approximated in most applications by a non-central F with p numerator degrees freedom, n − p denominator degrees freedom, and non-centrality λ₁. The size of λ₂ will give an indication of the adequacy of the approximation.
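The equivalence of the two rejection rules is purely algebraic and can be confirmed numerically (an illustrative check):

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(2)
p, n, alpha = 3, 60, 0.05
F_alpha = f.ppf(1 - alpha, p, n - p)
c_star = 1.0 + p * F_alpha / (n - p)

for _ in range(1000):
    # Arbitrary variance estimates with sigma-tilde^2 >= sigma-hat^2
    sig_tilde2 = rng.uniform(0.5, 5.0)
    sig_hat2 = rng.uniform(0.1, 1.0) * sig_tilde2
    T = sig_tilde2 / sig_hat2
    S = ((sig_tilde2 - sig_hat2) / p) / (sig_hat2 / (n - p))
    assert (T > c_star) == (S > F_alpha)   # the two rejection rules agree
```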
REFERENCES

[1] Fox, M., "Charts of the Power of the F-test," The Annals of Mathematical Statistics, 27 (June, 1956), 484-97.

[2] Gallant, A. R., "Inference for Nonlinear Models," Institute of Statistics Mimeograph Series No. 875, Raleigh, 1973.

[3] Gallant, A. R., "Statistical Inference for Nonlinear Regression Models," Ph.D. dissertation, Iowa State University, 1971.

[4] Graybill, F. A., An Introduction to Linear Statistical Models, Vol. I, New York: McGraw-Hill, 1961.

[5] IBM Corporation, System/360 Scientific Subroutine Package, Version III, White Plains, New York: IBM Corporation, Technical Publications Department, 1968.

[6] Malinvaud, E., "The Consistency of Nonlinear Regressions," The Annals of Mathematical Statistics, 41 (June, 1970), 956-69.

[7] Pearson, E. S. and Hartley, H. O., "Charts of the Power Function of the Analysis of Variance Tests, Derived from the Non-central F-distribution," Biometrika, 38 (1951), 112-30.