ESTIMATION IN LINEAR MODELS USING DIRECTIONALLY MINIMAX
MEAN SQUARED ERROR

by

THOMAS M. GERIG and GUILLERMO P. ZARATE-DE-LARA

Institute of Statistics
Mimeograph Series No. 1084
Raleigh, N.C. - August 1976

1. INTRODUCTION
The most commonly used estimators of the regression coefficients
in general linear models are the best linear unbiased estimators (or
ordinary least squares estimators).
By dropping the unbiasedness
criterion, linear estimators can be obtained that have smaller
variances.
In this situation, it is common to adopt some type of mean
squared error criterion which balances increased bias against reduced
variance.
Unfortunately, estimators derived from this type of criterion
invariably lead to expressions that involve the parameters themselves
and, as such, are useless in practice.
Several authors have proposed classes of biased regression
estimators which can be used as alternatives to the ordinary least
squares (OLS) estimator. These estimators are studied and compared to
OLS estimators with respect to mean squared error. Typically, they are
not uniformly better in this respect but in some situations can be
dramatically superior. Some of the common methods of biased regression
estimation include Hoerl and Kennard's [3] Ridge regression, Marquardt's
[4] Generalized Inverse regression, Mayer and Willke's [5] Shrunken
OLS regression, and Toro and Wallace's [7] False Restrictions
regression. Many others have proposed modifications and extensions
to these.
The purpose of this paper is to develop a meaningful criterion
which can be used to derive possibly biased estimators which will be
superior to OLS with respect to mean squared error. In Section 3, a
criterion is proposed which is based on a minimax argument. In the
formula for the trace of the mean squared error matrix, the value of
the regression parameters which is, in a sense, the least favorable
to estimation (that is, which results in the greatest bias) is
substituted for the unknown parameters. By this, it is hoped that the
derived estimators will protect the user from the worst case that
nature can produce.
In Section 4, this criterion is applied to the class of estimators
obtained by computing OLS estimators subject to possibly false restrictions.
The criterion is minimized by choice of a prespecified
number of independent restrictions.
A statistical test is given which
can be used to decide whether the resulting estimator is superior in
mean squared error to OLS.
The resulting estimators turn out to be
equivalent to Marquardt's [4] Generalized Inverse estimators.
Section 5 is devoted to the study of the class of Shrinkage
estimators defined by Goldstein and Smith [2]. Within this class the
optimum estimator turns out to be a member of the subclass of Shrunken
OLS estimators defined by Mayer and Willke [5].
2. NOTATION
Consider the linear model
(2.1)    $y = X\beta + e$,

where $y$ is an $(n \times 1)$ vector of observations, $X$ is an $(n \times m)$ matrix of rank $m$ $(\le n)$ of known constants, $\beta$ is an $(m \times 1)$ vector of unknown regression parameters, and $e$ is an $(n \times 1)$ vector of unobservable random variables with $E(e) = 0$ and $E(ee') = \Sigma\sigma^2$.
Define

(2.2)    $S = X'X$

and

(2.3)    $\hat\beta = S^{-1}X'y$,

and write the singular value decomposition of $X$ as

(2.4)    $X = U\Lambda^{1/2}V'$,

where $U$ is $(n \times m)$, $V$ is $(m \times m)$, $\Lambda^{1/2} = \mathrm{diag}(\lambda_1^{1/2}, \ldots, \lambda_m^{1/2})$, $\lambda_1^{1/2} \ge \cdots \ge \lambda_m^{1/2} > 0$, $U'U = I_m$, $V'V = VV' = I_m$, and $I_m$ is the $(m \times m)$ identity matrix. From this we write

(2.5)    $S = V\Lambda V'$.
Define the partitioned matrices

(2.6)    $U = (U_1 : U_2)$,

(2.7)    $V = (V_1 : V_2)$,

where $U_1$ is $(n \times m - u)$, $U_2$ is $(n \times u)$, $V_1$ is $(m \times m - u)$, and $V_2$ is $(m \times u)$, for some $u$. Similarly, define

(2.8)    $\Lambda^{1/2} = \begin{pmatrix} \Lambda_1^{1/2} & 0 \\ 0 & \Lambda_2^{1/2} \end{pmatrix}$,

where $\Lambda_1^{1/2}$ is $[(m - u) \times (m - u)]$ and $\Lambda_2^{1/2}$ is $(u \times u)$, for some $u$.
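As a concrete illustration of this notation, the short numerical sketch below (added for illustration; the simulated data and variable names are assumptions and do not appear in the original paper) builds $S$, the singular value decomposition quantities of (2.4)-(2.5), and the partitions (2.6)-(2.8) for a small simulated $X$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, u = 20, 4, 2                      # dimensions; u = number of restrictions
X = rng.normal(size=(n, m))             # hypothetical full-rank design matrix

S = X.T @ X                             # (2.2)  S = X'X
# Singular value decomposition X = U Lam^{1/2} V'   (2.4)
U, sv, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T
lam = sv**2                             # characteristic roots of S, lam_1 >= ... >= lam_m
Lam_half = np.diag(sv)

# (2.5): S = V Lam V'
assert np.allclose(S, V @ np.diag(lam) @ V.T)

# Partitions (2.6)-(2.8): the last u columns correspond to the u smallest roots
U1, U2 = U[:, :m - u], U[:, m - u:]
V1, V2 = V[:, :m - u], V[:, m - u:]
Lam1, Lam2 = np.diag(lam[:m - u]), np.diag(lam[m - u:])
```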
We shall be concerned with the simultaneous estimation of the regression parameters, $\beta$, using linear functions of the observations, $Ay + b$, where $A$ is some $(m \times n)$ matrix and $b$ an $(m \times 1)$ vector. For this we shall need the mean squared error matrix,

$M = M(Ay + b, \beta) = E[(Ay + b - \beta)(Ay + b - \beta)']$,

where $F = I - AX$. Also we shall need the trace of $M$,

$T(Ay + b, \beta) = \mathrm{tr}[M(Ay + b, \beta)]$.
Finally, define

(i)  $\|x\|^2 = x'x$, for any vector $x$;

(ii)  $\mathrm{Ch}_M(A)$ is the largest characteristic root of the matrix $A$;

(iii)  $C(A)$ is the vector space generated by the columns of $A$.

3. DIRECTIONALLY MINIMAX TRACE MEAN SQUARED ERROR ESTIMATORS
The goal of this work is to obtain an estimator of $\beta$ which does not depend on the parameters themselves and which, in some sense, minimizes the trace mean squared error, $T(Ay + b, \beta)$. Adopting a minimax philosophy leads to attempting to replace $\beta$ by the value which maximizes $T(Ay + b, \beta)$. This, of course, is not helpful as $T$ is unboundedly increasing with $\|\beta\|$. The modification adopted in this work is to express the parameters in the form $\beta = k\delta$, where $\delta$ is the vector of its direction cosines and $k$ its length. Then, for fixed values of $k$, the expression $T(Ay + b, \beta)$ can be maximized by choice of $\delta$. By Theorem 3.1 below, the ultimate choice of $\delta$ is independent of the value of $k$. This fact indicates that, emanating from the origin, there is one direction corresponding to the worst choice of $\beta$ with respect to mean squared error (or, equivalently, squared bias). It is from along this ray that the value of $\beta$ is chosen to maximize $T$. The exact location on the ray is set by choice of $k$.

Definition 3.1: A linear estimator $\tilde\beta = Ay + b$ is said to be a directionally minimax trace mean squared error (DMTMSE) estimator of $\beta$ if $A$ and $b$ are such that they minimize

$S_T(A, b, k) = \sup_{\beta \in \mathcal{B}_k} T(Ay + b, \beta) = \sigma^2\,\mathrm{tr}(A\Sigma A') + \sup_{\beta \in \mathcal{B}_k} (b + F\beta)'(b + F\beta)$,

where $\mathcal{B}_k = \{\beta : \beta = k\delta,\ \|\delta\| = 1\}$.

Theorem 3.1: In the linear model $(y, X\beta, \Sigma\sigma^2)$, $S_T(A, b, k) \ge S_T(A, 0, k)$ for any $A$, and in particular, the DMTMSE estimator is of the form $Ay$.
Proof: Either $\|F\beta + b\|^2 \ge \|F\beta\|^2$ or $\|F\beta + b\|^2 < \|F\beta\|^2$. If the latter holds, then it follows that $\|b\|^2 \le -2b'F\beta$ and hence

$\|-F\beta + b\|^2 = \|F\beta\|^2 + \|b\|^2 - 2b'F\beta \ge \|F\beta\|^2 + 2\|b\|^2 \ge \|F\beta\|^2$.

Thus, since $\beta \in \mathcal{B}_k$ implies $-\beta \in \mathcal{B}_k$,

$\sup_{\beta \in \mathcal{B}_k} \|F\beta + b\|^2 = \sup_{\beta \in \mathcal{B}_k} \max\{\|F\beta + b\|^2,\ \|-F\beta + b\|^2\} \ge \sup_{\beta \in \mathcal{B}_k} \|F\beta\|^2$,

with equality when $b = 0$. Therefore, $S_T(A, b, k) \ge S_T(A, 0, k)$, for all $A$ and $k$.
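The reduction in Theorem 3.1 can be checked numerically. The sketch below (an illustration added here; the simulated data, the ridge-type choice of $A$, the offset $b$, and the Monte Carlo search over directions are all assumptions, not part of the paper) compares the worst-case bias term over directions of $\beta$ with and without a nonzero $b$, and evaluates the closed form $\sigma^2\mathrm{tr}(A\Sigma A') + k^2\,\mathrm{Ch}_M(F'F)$ of the reduced criterion given below as (3.1).

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 15, 3
sigma2, k = 1.0, 2.0
X = rng.normal(size=(n, m))
Sigma = np.eye(n)                                   # assume uncorrelated errors here

A = np.linalg.solve(X.T @ X + np.eye(m), X.T)       # a biased (ridge-type) linear estimator
b = 0.5 * rng.normal(size=m)                        # some nonzero offset
F = np.eye(m) - A @ X

# Fixed, symmetric sample of unit directions delta (so both delta and -delta appear)
D = rng.normal(size=(20000, m))
D = np.vstack([D, -D])
D /= np.linalg.norm(D, axis=1, keepdims=True)

def worst_bias(b_vec):
    """Approximate sup over beta in B_k of ||F beta + b||^2 by sampling directions."""
    return np.max(np.sum((k * D @ F.T + b_vec) ** 2, axis=1))

# Theorem 3.1: the worst-case bias term is never smaller with b != 0 than with b = 0
print(worst_bias(b) >= worst_bias(np.zeros(m)))                     # True

# Closed form of the reduced criterion vs. the sampled supremum at b = 0
S_T = sigma2 * np.trace(A @ Sigma @ A.T) + k**2 * np.linalg.eigvalsh(F.T @ F).max()
print(S_T, sigma2 * np.trace(A @ Sigma @ A.T) + worst_bias(np.zeros(m)))
```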
In view of Theorem 3.1, the criterion can be reduced to:

(3.1)    $S_T(Ay, k) = \sigma^2\,\mathrm{tr}(A\Sigma A') + \sup_{\beta \in \mathcal{B}_k} \beta'F'F\beta$,

where $F = I - AX$.

4. DMTMSE ESTIMATION IN THE CLASS OF OLS ESTIMATORS
COMPUTED SUBJECT TO FALSE RESTRICTIONS
One class of biased linear estimators of regression coefficients
that has been proposed is that obtained by computing least squares
estimates under sets of false restrictions.
These estimators were studied by Toro and Wallace [7]. Their work is primarily concerned
with testing whether or not a particular set of false restrictions will
lead to estimators with smaller mean squared error.
They leave the
choice of restrictions up to the experimenter.
For this case, we shall restrict our attention to the linear model $(y, X\beta, I\sigma^2)$ and assume that $X$ is of full rank $(\mathrm{rank}(X) = m)$. To obtain the class of estimators we shall impose $u$ possibly false restrictions on the parameters:

$R\beta = h$,

where $R$ is a $(u \times m)$ matrix of rank $u$ $(< m)$ and $h$ is a $(u \times 1)$ vector. So that the equations are consistent, we shall assume that $h \in C(R)$.

It is well known (see, for example, Pringle and Rayner [6]) that under this setup the estimator is of the form

$\tilde\beta = \hat\beta - S^{-1}R'(RS^{-1}R')^{-1}(R\hat\beta - h)$,

where $S$ is given by (2.2).
Theorem 4.1: In the class of least squares estimators subject to possibly false restrictions for the setup $(y, X\beta, I\sigma^2)$, restrictions $R\beta = 0$ are preferred over $R\beta = h$ with respect to DMTMSE estimation.

Proof: The theorem follows from Theorem 3.1 since, for $R\beta = h$, the estimator will be of the form $Ay + b$, where $A$ is independent of $h$ and $b$ equals $0$ whenever $h = 0$.
In view of Theorem 4.1, attention will be limited to the class of
least squares estimators obtained subject to $R\beta = 0$.
Theorem 4.2: The class of estimators, $\tilde\beta(R)$, obtained by least squares subject to constraints $R\beta = 0$, where $R \in \{R : R$ is $(u \times m)$ of rank $u\}$, is equivalent to that where $R \in \{R : R$ is $(u \times m)$ of rank $u$ and $RS^{-1}R' = I\}$, where $S$ is given by (2.2).
Proof: Since $\Lambda^{1/2}$ and $V$ given in (2.4) are nonsingular, the rows of $\Lambda^{1/2}V'$ form a basis for $m$-Euclidean space and, hence, $R$ can be written as $R = B\Lambda^{1/2}V'$, for some $(u \times m)$ matrix $B$. Since $u = \mathrm{rank}(R) \le \mathrm{rank}(B) \le u$, $B$ must have full row rank. Now, $RS^{-1}R' = BB'$ is positive definite. Thus, if we let $BB' = GG'$, $|G| \ne 0$, it is easily seen that $\tilde\beta(R) = \tilde\beta(G^{-1}R)$. Also, if we let $\tilde R = G^{-1}R$, we have $\tilde R S^{-1}\tilde R' = I$. Thus, for every $R$ there exists a corresponding $\tilde R$ such that $\tilde\beta(R) = \tilde\beta(\tilde R)$ and $\tilde R S^{-1}\tilde R' = I$.
Using Theorem 4.2 we can respecify the class of estimators of interest as

$\tilde\beta(R) = (I - S^{-1}R'R)\hat\beta = (I - S^{-1}R'R)S^{-1}X'y$,

for all $R$ such that $RS^{-1}R' = I$. Within this class we wish to find the optimum with respect to the DMTMSE criterion set out in Section 3.

It should be noted that the value of $u$ is assumed to be given and fixed. Also, because of the assumption that $R$ is of full row rank, the ordinary least squares (OLS) estimator is not in the class. However, there exist members of the class that are arbitrarily "close" to the OLS estimator.
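A short numerical sketch of this class (illustrative only; the simulated data and the particular restriction matrix are assumptions made here): for a restriction matrix $R$ normalized so that $RS^{-1}R' = I$, the estimator below reproduces the usual restricted least squares solution with $h = 0$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, u = 30, 5, 2
X = rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)

R0 = rng.normal(size=(u, m))                        # arbitrary full-row-rank restrictions
# Normalize as in Theorem 4.2 so that R S^{-1} R' = I
G = np.linalg.cholesky(R0 @ np.linalg.solve(S, R0.T))
R = np.linalg.solve(G, R0)
assert np.allclose(R @ np.linalg.solve(S, R.T), np.eye(u))

# Estimator in the respecified class: (I - S^{-1} R'R) beta_hat
beta_R = (np.eye(m) - np.linalg.solve(S, R.T @ R)) @ beta_ols

# Same as classical restricted LS subject to R beta = 0
RSinvRt_inv = np.linalg.inv(R @ np.linalg.solve(S, R.T))
beta_rls = beta_ols - np.linalg.solve(S, R.T @ RSinvRt_inv @ (R @ beta_ols))
assert np.allclose(beta_R, beta_rls)
print(R @ beta_R)                                   # restrictions hold: ~0
```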
Subsections 4.1 and 4.2 below deal with deriving the optimum estimator for $u = 1$ and for general $u$, respectively, and subsection 4.3 examines some properties of the resulting estimators.
4.1 The Single Constraint Case
The first case we shall consider is when $u = 1$. In this case the set of restrictions being imposed is of the form $r'\beta = 0$, where $r$ is an $(m \times 1)$ vector satisfying $r'S^{-1}r = 1$. By putting $A = (I - S^{-1}rr')S^{-1}X'$ in (3.1), and since

$\mathrm{Ch}_M(F'F) = \mathrm{Ch}_M(rr'S^{-2}rr') = r'r\, r'S^{-2}r$,

we get

$S_T(r, k) = \sigma^2\,\mathrm{tr}[S^{-1} - S^{-1}rr'S^{-1} - S^{-1}rr'S^{-1} + S^{-1}rr'S^{-1}rr'S^{-1}] + k^2\, r'r\, r'S^{-2}r$
$\qquad = \sigma^2\,\mathrm{tr}[S^{-1}] + (k^2\, r'r - \sigma^2)\, r'S^{-2}r$
$\qquad = \sigma^2 \sum_{i=1}^m \lambda_i^{-1} + (k^2\, r'r - \sigma^2)\, r'S^{-2}r$,

where $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_m)$ and $\Lambda$ is defined in (2.4) and (2.5).
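The algebra above can be verified numerically. The sketch below (an added illustration with simulated data and an arbitrary normalized $r$, none of which appear in the paper) compares the general criterion $\sigma^2\mathrm{tr}(AA') + k^2\,\mathrm{Ch}_M(F'F)$ with the closed form just derived.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 25, 4
sigma2, k = 1.5, 2.0
X = rng.normal(size=(n, m))
S = X.T @ X
S_inv = np.linalg.inv(S)

r = rng.normal(size=m)
r /= np.sqrt(r @ S_inv @ r)                      # enforce r'S^{-1}r = 1

A = (np.eye(m) - np.outer(S_inv @ r, r)) @ S_inv @ X.T   # restricted-LS estimator matrix
F = np.eye(m) - A @ X

general = sigma2 * np.trace(A @ A.T) + k**2 * np.linalg.eigvalsh(F.T @ F).max()
lam = np.linalg.eigvalsh(S)                      # characteristic roots of S
closed = sigma2 * np.sum(1.0 / lam) + (k**2 * (r @ r) - sigma2) * (r @ S_inv @ S_inv @ r)
print(general, closed)                           # the two agree
```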
The following theorem gives the vector $r$ which minimizes $S_T(r, k)$.

Theorem 4.3: $S_T(r, k) \ge \sigma^2 \sum_{i=1}^{m-1} \lambda_i^{-1} + k^2$, with equality when $r = \lambda_m^{1/2} v_m$, where $\lambda_m$ and $v_m$ are defined in (2.4). That is, the DMTMSE estimator in the class of least squares estimators computed subject to one possibly incorrect restriction is $\tilde\beta = (I - v_m v_m')S^{-1}X'y$.
striction is
Proof:
implies
Let
w = A-
w'w = 1.
1/2
Vir
so that
r = VA
Also,
Now
m
2 m ~:1 + (k2
=a
r: ~
r:
i=l
i=l
= a
m
2
-1
r:
- a
~
2
m
(~
i=l
2
w.~.
~
~
2 m
r:
. 1
~=
- a )
2
w.~.)(
~ ~
m
r:
1=1
2-1
w.~.
~
~
2-1
w.~.
~ ~
)
2-1
m
2
+ k
~i
i=l
Since
1/2
w
. 1
~=
(4.1)
w.~.
~
~
2
m
r: w. = 1 and since the harmonic mean is always less than or
i=l ~
equal to the arithmetic mean:
m
(~
i=l
2
2-1
m
w.~.)( ~
~ ~
w.~.
~ ~
i=l
)
~
(4.2)
1 •
Furthermore, since the value of a weighted arithmetic mean is
a1w~ys
less than or equal to the largest value being averaged:
m
r:
i=l
2-1
w.~.
~ ~
-1
Am
:;;
(4.3)
Using (4.2) and (4.3) in (4.1) we get
ST(r, k)
~a
2
m
i=l
~a
2
~
i=l
m
r: ~:1 + k2
i=l
=a
m
2
r: ~:1 + k2 - a r:
~
2 m-1 ~:1 + k2
r: ~
i=l
- a
2 -1
w.~.
~
~
2 -1
~m
•
11
Noting that equality holds in (4~2) whenever
of
i
w =1
m
w
= 0,
m-1
o I> • ,
wi > 0
for which
,
0, 1)
and in (4.3) whenever
for all values
= w2 = ... -
wI
we see that both equalities hold when
in which case
f2m vm'
8 (
T
=A
A.~
k) =a
r
112 v .
= V1\1/2w = Am
m
2 m-1
L:
A: 1 +
~
i=l
k
= (0,
Wi
0,
Thus,
2
and the resulting estimator becomes
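As an illustration (the simulated data are an assumption added here, not part of the paper), the following sketch computes the single-constraint DMTMSE estimator of Theorem 4.3 and confirms that it agrees with the restricted least squares estimator under the worst-direction restriction $r = \lambda_m^{1/2} v_m$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 40, 4
X = rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)

lam, V = np.linalg.eigh(S)                 # ascending order: lam[0] is the smallest root
v_m = V[:, [0]]                            # characteristic vector of the smallest root
lam_m = lam[0]

# DMTMSE estimator for u = 1 (Theorem 4.3): project beta_hat away from v_m
beta_dmtmse = (np.eye(m) - v_m @ v_m.T) @ beta_ols

# Same thing via restricted LS with r = lam_m^{1/2} v_m (so that r'S^{-1}r = 1)
r = np.sqrt(lam_m) * v_m
beta_rls = beta_ols - np.linalg.solve(S, r @ (r.T @ beta_ols))
print(np.allclose(beta_dmtmse, beta_rls))                        # True
```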
The estimator thus derived can be seen to be of the form $\tilde\beta = P\hat\beta$, where $P$ is an orthogonal projection matrix and $\hat\beta$ is the OLS estimator. The estimator can be obtained by projecting $\hat\beta$ into the space orthogonal to the characteristic vector corresponding to the smallest characteristic root of $X'X$.

Another interesting feature of this result is the fact that the estimator does not depend on the value of $k$ and, hence, $k$ need not be specified beforehand. Of course, the resulting mean squared errors will involve $k$.

4.2 The General Case

For the general case we shall assume that $u$ is a fixed number $(1 \le u \le m)$. The solution has the appearance of that for $u = 1$ but the proof of the theorem is more involved.
Theorem 4.4: For the class of least squares estimates computed subject to a set of $u$ independent, possibly incorrect restrictions,

$S_T(R, k) \ge \sigma^2 \sum_{i=1}^{m-u} \lambda_i^{-1} + k^2$,

with equality holding whenever $R = \Lambda_2^{1/2}V_2'$, where $\Lambda_2$ and $V_2$ are given in (2.7) and (2.8). That is, for this class of estimators the DMTMSE estimator is $\tilde\beta = (I - V_2V_2')S^{-1}X'y$.
Proof: We can write $R = (BB')^{-1/2}B\Lambda^{1/2}V'$, where $B$ is any $(u \times m)$ matrix of rank $u$. From this we have that

$\mathrm{tr}[S^{-1} - S^{-1}R'RS^{-1}] = \sum_{i=1}^m \lambda_i^{-1} - \sum_{i=1}^m p_{ii}\lambda_i^{-1}$,

where $P = B'(BB')^{-1}B = ((p_{ij}))$. Since $P$ is symmetric idempotent we have that $\sum_{j=1}^m p_{ij}^2 = p_{ii}$, which implies that $p_{ii} \ge 0$. Further, if $p_{ii} > 0$ we have that $0 \le \sum_{j \ne i} p_{ij}^2 = p_{ii} - p_{ii}^2$, i.e., $p_{ii} \le 1$. It is well known that $\sum_{i=1}^m p_{ii} = u$. Therefore, $\sum_{i=1}^m p_{ii}\lambda_i^{-1}$ is maximized subject to $0 \le p_{ii} \le 1$ and $\sum_{i=1}^m p_{ii} = u$ by choosing $p_{11} = \cdots = p_{(m-u)(m-u)} = 0$ and $p_{(m-u+1)(m-u+1)} = \cdots = p_{mm} = 1$, giving

$\sum_{i=1}^m p_{ii}\lambda_i^{-1} = \sum_{i=m-u+1}^m \lambda_i^{-1}$.

This corresponds to taking $B = [0 : I_u]$. Using this we have

(4.4)    $\mathrm{tr}[S^{-1} - S^{-1}R'RS^{-1}] \ge \sum_{i=1}^{m-u} \lambda_i^{-1}$.
Next, we have that $F = I - AX = S^{-1}R'R = VQV'$, where $Q = \Lambda^{-1/2}P\Lambda^{1/2}$. Note that $Q^2 = Q$ but $Q$ is not symmetric. Furthermore, if $\mu \in C(Q)$ and $\|\mu\| = 1$, then $\mu = Q\mu^*$ for some $\mu^*$ and, hence, $Q\mu = Q^2\mu^* = Q\mu^* = \mu$ and $\|Q\mu\|^2 = \|\mu\|^2 = 1$. Therefore

$\mathrm{Ch}_M(Q'Q) = \sup_{x:\,x'x = 1} \|Qx\|^2 \ge 1$

and

(4.5)    $\sup_{\beta \in \mathcal{B}_k} \beta'F'F\beta = k^2\,\mathrm{Ch}_M(Q'Q) \ge k^2$.

Using (4.4) and (4.5) we may conclude that

(4.6)    $S_T(R, k) \ge \sigma^2 \sum_{i=1}^{m-u} \lambda_i^{-1} + k^2$.

It remains to demonstrate that the lower bound is attainable. Putting $B = (0 : I_u)$ we get $R = (BB')^{-1/2}B\Lambda^{1/2}V' = \Lambda_2^{1/2}V_2'$, and substituting this into the left hand side of (4.6) gives

$S_T(\Lambda_2^{1/2}V_2', k) = \sigma^2 \sum_{i=1}^{m-u} \lambda_i^{-1} + k^2$,

which is the right hand side of (4.6). Finally, the estimator can take various forms:

$\tilde\beta = (I - S^{-1}R'R)\hat\beta = (I - V_2V_2')\hat\beta = V_1\Lambda_1^{-1}V_1'X'y$,

where $\hat\beta$ is the OLS estimator.

4.3 Some Properties of the Estimator
Marquardt [4] proposed a class of regression estimators he called Generalized Inverse estimators. He argued that, as an alternative to using precise computing methods for obtaining "exact" solutions in situations where the $X'X$ matrix has some small but positive characteristic roots, it might be preferable to assign a lower rank to $X'X$ and obtain a solution under this assumption. In our notation, his estimator can be written as

$\beta^+ = V_1\Lambda_1^{-1}V_1'X'y$,

where $m - u$ is the assigned rank of $X'X$ and $\Lambda_1$ and $V_1$ are defined in (2.7) and (2.8).

Theorem 4.5: The DMTMSE estimator for the class of least squares estimators computed subject to possibly false restrictions is equivalent to Marquardt's Generalized Inverse estimators when the assigned rank of $X'X$ equals $m - u$.
Proof: Since we are assuming that $X'X$ is of full rank, the Generalized Inverse estimator with assigned rank $m - u$ is $\beta^+ = V\Lambda^+V'X'y = V_1\Lambda_1^{-1}V_1'X'y$, where $\Lambda^+ = \mathrm{diag}(\Lambda_1^{-1}, 0)$, and this is precisely the form of the DMTMSE estimator given in Theorem 4.4.
Some further results which follow easily from previous results are summarized without proof in the following theorem.

Theorem 4.6: For any $(m \times 1)$ vector $\ell$:

(i)  $\mathrm{Var}(\ell'\tilde\beta) \le \mathrm{Var}(\ell'\hat\beta)$, with equality if, and only if, $\ell \in C(V_1)$, in which case $\ell'\tilde\beta = \ell'\hat\beta$.

(ii)  $E(\ell'\tilde\beta) = \ell'\beta - \ell'V_2V_2'\beta$, and the second term (the bias) vanishes if, and only if, $\ell \in C(V_1)$ or $\beta \in C(V_1)$.

(iii)  The estimator $\tilde\beta$ is shorter than $\hat\beta$, i.e., $\tilde\beta'\tilde\beta < \hat\beta'\hat\beta$.

(iv)  For the more general model $(y, X\beta, \Sigma\sigma^2)$, where $\Sigma \ne I$, the corresponding estimator is of the form

$\tilde\beta = \tilde V_1\tilde V_1'(X'\Sigma^{-1}X)^{-1}X'\Sigma^{-1}y$,

where $X'\Sigma^{-1}X = \tilde V\tilde\Lambda\tilde V'$, $\tilde\Lambda$ is diagonal, $\tilde V'\tilde V = \tilde V\tilde V' = I$, and $\tilde V_1$ is an $(m \times m - u)$ matrix.
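These relationships are easy to verify numerically. The sketch below (simulated data and variable names are assumptions added for illustration) computes the general-$u$ DMTMSE estimator in its three equivalent forms, which is also Marquardt's Generalized Inverse estimator of assigned rank $m - u$ (Theorem 4.5), and checks the shortening property (iii) of Theorem 4.6.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m, u = 50, 6, 2
X = rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)

lam, V = np.linalg.eigh(S)                       # ascending roots
lam, V = lam[::-1], V[:, ::-1]                   # reorder so lam_1 >= ... >= lam_m
V1, V2 = V[:, :m - u], V[:, m - u:]
Lam1 = np.diag(lam[:m - u])

# Three equivalent forms of the DMTMSE / Generalized Inverse estimator
R = np.diag(np.sqrt(lam[m - u:])) @ V2.T                         # R = Lam_2^{1/2} V_2'
form1 = (np.eye(m) - np.linalg.solve(S, R.T @ R)) @ beta_ols     # (I - S^{-1}R'R) beta_hat
form2 = (np.eye(m) - V2 @ V2.T) @ beta_ols                       # (I - V_2 V_2') beta_hat
form3 = V1 @ np.linalg.solve(Lam1, V1.T @ (X.T @ y))             # V_1 Lam_1^{-1} V_1' X'y
print(np.allclose(form1, form2), np.allclose(form2, form3))      # True True

# Theorem 4.6 (iii): the estimator is shorter than OLS
print(form2 @ form2 <= beta_ols @ beta_ols)                      # True
```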
Toro and Wallace [7] give a statistical test for determining whether imposing a given set of possibly false restrictions, $R\beta = 0$, will result in a reduced mean square error. This test is appropriate for use in this setting provided we make the assumption that the errors in the model are normally distributed. In this case, the form of the test is:

Reject $H_0$ if $W \ge W_\alpha$,

where

$W = [y'U_2U_2'y/u]\,/\,[y'(I - UU')y/(n - m)]$

and $U$ and $U_2$ are defined in (2.4) and (2.6). Note that the hypothesis $H_0$ states that $\tilde\beta$ is superior to the OLS estimator, $\hat\beta$, with respect to mean squared error, and accepting $H_0$ suggests that $\tilde\beta$ is preferred over $\hat\beta$. It can be seen that $W$ is a noncentral $F$ random variable with $u$ and $(n - m)$ degrees of freedom and noncentrality $\delta = \beta'V_2\Lambda_2V_2'\beta/2\sigma^2$, where $V_2$ and $\Lambda_2$ are defined in (2.7) and (2.8). It can be seen also that $\delta \le \beta'V_2V_2'\beta\,\lambda_{m-u+1}/2\sigma^2$, where $\lambda_{m-u+1}$ is defined in (2.4) and (2.5). Therefore, $\delta$ is small whenever the projection of $\beta$ on $C(V_2)$ is small or $\lambda_{m-u+1}$ is small. The latter condition is precisely that for which the Generalized Inverse estimators are intended. Some critical points and power computations are given in [7].
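A sketch of how the test statistic might be computed (an illustration assuming simulated normal data; the resulting value of $W$ would be compared with the tabled critical point $W_\alpha$ from [7], which is not reproduced here).

```python
import numpy as np

rng = np.random.default_rng(6)
n, m, u = 60, 5, 2
sigma = 1.0
X = rng.normal(size=(n, m))
beta_true = rng.normal(size=m)
y = X @ beta_true + sigma * rng.normal(size=n)

# Pieces of the decomposition (2.4), (2.6)
U, sv, Vt = np.linalg.svd(X, full_matrices=False)
U2 = U[:, m - u:]                                  # columns tied to the u smallest roots

numerator = (y @ U2 @ U2.T @ y) / u
denominator = (y @ (np.eye(n) - U @ U.T) @ y) / (n - m)
W = numerator / denominator
print(W)   # compare with the critical point W_alpha tabled by Toro and Wallace [7]
```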
5. THE DMTMSE ESTIMATOR FOR THE CLASS OF SHRINKAGE ESTIMATORS
Many regression estimators have been proposed that have the form

$\tilde\beta = Ay = V\Gamma U'y$,

where $U$ and $V$ are defined by (2.4) and $\Gamma = \mathrm{diag}(\gamma_1, \ldots, \gamma_m)$. Included in this class are OLS $(\gamma_j = \lambda_j^{-1/2},\ j = 1, \ldots, m)$, Generalized Inverse regression $(\gamma_j = \lambda_j^{-1/2},\ j = 1, \ldots, m - u;\ \gamma_j = 0,\ j = m - u + 1, \ldots, m)$, Ridge regression $(\gamma_j = \lambda_j^{1/2}/(\lambda_j + k^2),\ j = 1, \ldots, m)$, as well as others. This class is discussed by Goldstein and Smith [2].
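The sketch below (an illustration with simulated data; the ridge penalty value is an arbitrary assumption) confirms that the OLS, Generalized Inverse, and Ridge estimators all fit the $V\Gamma U'y$ form with the diagonal weights listed above.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m, u, k2 = 45, 5, 2, 3.0
X = rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + rng.normal(size=n)

U, sv, Vt = np.linalg.svd(X, full_matrices=False)
V, lam = Vt.T, sv**2

def vgu(gamma):
    """Estimator of the form V diag(gamma) U'y."""
    return V @ (gamma * (U.T @ y))

# OLS: gamma_j = lam_j^{-1/2}
assert np.allclose(vgu(lam**-0.5), np.linalg.solve(X.T @ X, X.T @ y))

# Generalized Inverse regression of assigned rank m - u
gamma_gi = np.where(np.arange(m) < m - u, lam**-0.5, 0.0)
assert np.allclose(vgu(gamma_gi),
                   V[:, :m-u] @ np.diag(1/lam[:m-u]) @ V[:, :m-u].T @ X.T @ y)

# Ridge regression with penalty k^2: gamma_j = lam_j^{1/2} / (lam_j + k^2)
assert np.allclose(vgu(sv / (lam + k2)),
                   np.linalg.solve(X.T @ X + k2*np.eye(m), X.T @ y))
print("all three are members of the V Gamma U'y class")
```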
In the following theorem, the DMTMSE estimator for this class is
derived.
Theorem 5.1: In the linear model $(y, X\beta, I\sigma^2)$, the DMTMSE estimator from the class of estimators of the form $\tilde\beta = V\Gamma U'y$, where $U$ and $V$ are defined by (2.4) and $\Gamma = \mathrm{diag}(\gamma_1, \ldots, \gamma_m)$, is $\tilde\beta = t\hat\beta$, where $\hat\beta$ is the OLS estimator,

$t = c^2 \Big/ \left(\sum_{j=1}^m \lambda_j^{-1} + c^2\right)$  and  $c^2 = k^2/\sigma^2$.

Proof: First,

$S_T = \sigma^2 \sum_{j=1}^m \gamma_j^2 + k^2 \max\left[(1 - \lambda_j^{1/2}\gamma_j)^2,\ j = 1, 2, \ldots, m\right]$.

Suppose that the value of the second term is $p^2$. Then any set of $\gamma_j$'s that minimizes $S_T$ must satisfy $(1 - \lambda_j^{1/2}\gamma_j)^2 \le p^2$ for $j = 1, \ldots, m$ and, hence, $\gamma_j \ge (1 - p)\lambda_j^{-1/2}$, $j = 1, \ldots, m$. If $S_T$ is a minimum, therefore, $\gamma_j$ must equal $(1 - p)\lambda_j^{-1/2}$, its smallest possible value, so as to make $\sum_j \gamma_j^2$ as small as possible. This implies that

$S_T/\sigma^2 = (1 - p)^2 \sum_{j=1}^m \lambda_j^{-1} + p^2 c^2$.

Since this is convex in $p$, the minimum can be found by setting the first derivative equal to zero, resulting in

$p = \sum_{j=1}^m \lambda_j^{-1} \Big/ \left(\sum_{j=1}^m \lambda_j^{-1} + c^2\right)$.

Thus,

$\gamma_j = \left[c^2 \Big/ \left(\sum_{j=1}^m \lambda_j^{-1} + c^2\right)\right]\lambda_j^{-1/2}$,

where $c^2 = k^2/\sigma^2$.
The estimator obtained in Theorem 5.1 is a deterministically shrunken estimator as defined and studied in Mayer and Willke [5]. Its form is simply a scalar times the OLS estimator. The scalar factor is between 0 and 1 and, as such, has the effect of shortening the length of the vector $\hat\beta$.
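For illustration (simulated data; the values of $k$ and $\sigma^2$ are assumptions chosen here), the shrinkage factor of Theorem 5.1 can be computed as follows.

```python
import numpy as np

rng = np.random.default_rng(8)
n, m = 35, 4
sigma2, k = 1.0, 2.0
X = rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + np.sqrt(sigma2) * rng.normal(size=n)

S = X.T @ X
lam = np.linalg.eigvalsh(S)
beta_ols = np.linalg.solve(S, X.T @ y)

c2 = k**2 / sigma2
t = c2 / (np.sum(1.0 / lam) + c2)       # shrinkage factor of Theorem 5.1, 0 < t < 1
beta_shrunk = t * beta_ols
print(t, np.linalg.norm(beta_shrunk) < np.linalg.norm(beta_ols))   # shorter than OLS
```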
6. THE DMTMSE ESTIMATOR FOR THE CLASS OF GENERAL LINEAR FUNCTIONS

The model

$y = X\beta + e = U\Lambda^{1/2}V'\beta + e$

can be decomposed as follows. Let $\tilde U$ be an $(n \times n - m)$ matrix of rank $n - m$ which satisfies $\tilde U'U = 0$ and $\tilde U'\tilde U = I_{n-m}$. That is, let $\tilde U$ be the orthogonal complement of $U$. Then

$\begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \begin{pmatrix} U'y \\ \tilde U'y \end{pmatrix} = \begin{pmatrix} \Lambda^{1/2}V' \\ 0 \end{pmatrix}\beta + \begin{pmatrix} e_1 \\ e_2 \end{pmatrix}$,

where $E(e_i) = 0$, $i = 1, 2$. Thus, it seems sensible to base the estimation of $\beta$ solely upon $y_1$. For purposes of estimating $\beta$, then, we shall rewrite the model as:

(6.1)    $y_1 = \beta_1 + e_1$,  where $\beta_1 = \Lambda^{1/2}V'\beta$ and $e_1 = U'e$.
Theorem 6.1: The DMTMSE linear estimator of $\beta_1$ in (6.1) is given by $\tilde\beta_1 = ty_1$, where $t = c^2/(m + c^2)$ and $y_1$ is defined in (6.1) (Cohen [1]).

Proof: Since $X = I$ we have

$S_T(Ay_1, \beta_1) = \sigma^2\,\mathrm{tr}(AA') + k^2\,\mathrm{Ch}_M[(I - A)(I - A')]$.

Suppose that $A$ is not symmetric; then define $A^* = I - [(I - A)(I - A')]^{1/2}$, which is symmetric. Note first that $\mathrm{Ch}_M[(I - A^*)(I - A^{*\prime})] = \mathrm{Ch}_M[(I - A)(I - A')]$. Also,

$\mathrm{tr}[(I - A)(I - A')] = \mathrm{tr}(I) - 2\,\mathrm{tr}(A) + \mathrm{tr}(AA') = \mathrm{tr}(I) - 2\,\mathrm{tr}(A^*) + \mathrm{tr}(A^*A^{*\prime}) = \mathrm{tr}[(I - A^*)(I - A^{*\prime})]$.

But $\mathrm{tr}(I - A^*) = \mathrm{tr}\{[(I - A)(I - A')]^{1/2}\} \ge \mathrm{tr}(I - A)$, which implies $\mathrm{tr}(A^*) \le \mathrm{tr}(A)$ and hence $\mathrm{tr}(A^*A^{*\prime}) \le \mathrm{tr}(AA')$. Therefore, $S_T(A^*y_1, \beta_1) \le S_T(Ay_1, \beta_1)$, and attention may be restricted to symmetric $A$.

Assuming that $A$ is symmetric, we write $A = G\Gamma G'$ and

$S_T = \sigma^2 \sum_{j=1}^m \gamma_j^2 + k^2 \max\left[(1 - \gamma_j)^2,\ j = 1, \ldots, m\right]$.

Using the identical argument of Theorem 5.1, the minimizing value of $\gamma_j$ is $\gamma_j = c^2/(m + c^2)$, $j = 1, \ldots, m$. The choice of $G$ is arbitrary.
Using the result of Theorem 6.1, an estimator for $\beta$ can be obtained in terms of $y$:

$\tilde\beta = V\Lambda^{-1/2}\tilde\beta_1 = tV\Lambda^{-1/2}U'y = t\hat\beta$,

where $t = c^2/(m + c^2)$. This is a solution for the class of general linear estimators in the decomposed case given by (6.1). However, because the DMTMSE estimator of $C\beta$ is not necessarily equal to $C$ times that of $\beta$ alone, it is not the solution for the original model. Indeed, the form of the estimator in the present case is $V\Lambda^{-1/2}HU'y$ for some $H$.
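A brief sketch (illustrative, with simulated data and an assumed value of $c^2$) of the back-transformed estimator obtained from Theorem 6.1, compared with the Section 5 shrinkage factor to show that the two shrinkage constants differ in general.

```python
import numpy as np

rng = np.random.default_rng(9)
n, m = 30, 4
sigma2, k = 1.0, 2.0
X = rng.normal(size=(n, m))
y = X @ rng.normal(size=m) + rng.normal(size=n)

U, sv, Vt = np.linalg.svd(X, full_matrices=False)
V, lam = Vt.T, sv**2
c2 = k**2 / sigma2

# Theorem 6.1 applied to y1 = U'y, then mapped back: beta = V Lam^{-1/2} beta_1
t6 = c2 / (m + c2)
beta_from_y1 = V @ np.diag(1.0 / sv) @ (t6 * (U.T @ y))     # = t6 * beta_hat

# Section 5 DMTMSE shrinkage factor for the original model
t5 = c2 / (np.sum(1.0 / lam) + c2)
print(t6, t5)        # the two shrinkage constants generally differ
```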
REFERENCES

[1] Cohen, Arthur, "All admissible linear estimators of the mean vector," Annals of Mathematical Statistics, 37 (April 1966), 458-463.

[2] Goldstein, M. and Smith, A. F. M., "Ridge-type estimators for regression analysis," Journal of the Royal Statistical Society, Ser. B, 36 (June 1974), 284-291.

[3] Hoerl, Arthur E. and Kennard, Robert W., "Ridge regression: Biased estimation for nonorthogonal problems," Technometrics, 12 (February 1970), 55-67.

[4] Marquardt, D. W., "Generalized inverses, ridge regression, biased linear estimation, and nonlinear estimation," Technometrics, 12 (August 1970), 591-612.

[5] Mayer, Lawrence S. and Willke, Thomas A., "On biased estimation in linear models," Technometrics, 15 (August 1973), 497-508.

[6] Pringle, R. M. and Rayner, A. A., Generalized Inverse Matrices With Applications to Statistics, New York: Hafner Publishing Co., 1971.

[7] Toro-Vizcarrondo, Carlos and Wallace, T. D., "A test of the mean square error criterion for restrictions in linear regression," Journal of the American Statistical Association, 63 (June 1968), 558-572.
APPENDIX

Alternative proof of Theorem 6.1: Since $X = I$, we have

(1)    $S_T(Ay_1, \beta_1) = \sigma^2\,\mathrm{tr}(AA') + k^2\,\mathrm{Ch}_M[(I - A)(I - A')]$.

Write $A = G\Gamma H'$, where $G$ and $H$ are unitary matrices; therefore

$\mathrm{tr}(AA') = \mathrm{tr}[G\Gamma H'H\Gamma G'] = \mathrm{tr}[\Gamma^2] = \sum_{i=1}^m \gamma_i^2$,

so that the trace term is independent of $G$ and $H$. Note now that

(2)    $\mathrm{Ch}_M[(I - A)(I - A')] = \mathrm{Ch}_M[(I - G\Gamma H')(I - H\Gamma G')] = \mathrm{Ch}_M[(G'H - \Gamma)(G'H - \Gamma)'] = \mathrm{Ch}_M[(U - \Gamma)(U - \Gamma)'] = \|U - \Gamma\|^2 = \|I - U'\Gamma\|^2$,

where $U = G'H$ and $\|\cdot\|$ denotes the matrix norm induced by the Euclidean norm (known as the spectral norm)*.

Take $e_i = [0, \ldots, 0, 1, 0, \ldots, 0]'$; since $\Gamma e_i = \gamma_i e_i$, it follows that

$(I - U'\Gamma)e_i = e_i - \gamma_i U'e_i$.

Write $U'e_i = \alpha_1 e_i + c$, where $\alpha_1^2 + \|c\|^2 = 1$ and $c$ is orthogonal to $e_i$. Hence

(3)    $\|(I - U'\Gamma)e_i\|^2 = (1 - \gamma_i\alpha_1)^2 + \gamma_i^2\|c\|^2 = 1 - 2\gamma_i\alpha_1 + \gamma_i^2 \ge (1 - \gamma_i)^2 = \|(I - \Gamma)e_i\|^2$

(since $\gamma_i \ge 0$ and $\alpha_1 \le 1$). By applying (3) in (2) we conclude

$\mathrm{Ch}_M[(U - \Gamma)'(U - \Gamma)] \ge \mathrm{Ch}_M[(I - \Gamma)'(I - \Gamma)]$,

and the equality is attained by letting $G = H$, so that $U = G'H = I$. Hence, we write $A = G\Gamma G'$, which implies that

$S_T = \sigma^2 \sum_{j=1}^m \gamma_j^2 + k^2 \max\left[(1 - \gamma_j)^2,\ j = 1, \ldots, m\right]$.

Using the identical argument of Theorem 5.1, the minimizing value of $\gamma_j$ is

$\gamma_j = \dfrac{c^2}{m + c^2}$,  $j = 1, \ldots, m$.

*See Lancaster, Peter, Theory of Matrices, Academic Press, New York and London, page 210.