Download Barizi.; (1973)An assessment and some aplications of the empirical Bayes approach to random regression models." Thesis.

AN ASSESSMENT AND SOME APPLICATIONS O~M EM.PIRICAL BAYES APPROACH .,., .IANroM REGRESSION M>rELS by BARIZI Institute of Statistics Mimeograph Series No. 85P, January 1973 vi TABLE OF CONTENTS Page LIST OF TABLES • . •• vii 1. INTRODUCTION.. 1 2. REVIEW OF LITERATURE 4 3. BAYES AND EMPIRICAL BAYES ESTIMATION IN RANDOM REGRESSION MODELS • • • • • • • • . • . • • • . . • • • 3.1 Bayes Estimation In General • • . • . . . . . 3.2 Least Squares Estimation . . . . • • . . . • • 3.3 Bayes Estimation In Random Regression Models 10 12 3.5 Remarks on Empirical 43 13 23 3.4 Empirical Bayes Estimation • • . . • • • • • 4. ~aye~ Estimation . . . . . COMPARISON OF EMPJ;RICAL BAYES TO OlIDINARY LEAST SQUARES ESTIMATORS • • . • • • • • • • • 49 4.1 Objeotives Of The Study. • • . . . . . . . • 4.2 Generation of R8Jldom Variables c, ex, 13 and ~ 4.3 Comp~tations Of The OLS And EB Estimates 4.4 Results . . . . , . . . . . . . . . . 0 5. 6. 10 • 50 52 53 58 • APPLICATIONS TO SOME PROBLEMS IN ECONOMETRICS 68 5.1 lnvestment Analysis • • • • • • • . . 5.2 Elasticities of Substitution ••• 68 SUMMARY, CONCLUSIONS AND RECOMMENDAT;I:ONS • • . 74 6. 1 SUIllIIlary. . • . • • • • . • . . . 6.2 Conclusions and Recommendations 7. REFERENCES 8. APPENDIX... ...... .. ... 72 .. 74 75 77 80 vii LIST OF TABLES Page 3·1 Relation of C.V. and b/a • . . • 4.1 Several 4.2 4.3 ~ . . . . . . . ~ and ~ 2 Estimates for the marginal bias of 3 . 52 •••• 58 . . . . . ... Ratio of the average squared error of EB to 016 estimators for cas\=s 1, 2 and 3 • • • • • •• arld 4.4 of prior densities of c, a, form~ studied. . 48 ~2 ~ for cases 1, 2 . . . . . . . . . . . . . 60 The +atio R for cases 1, 2 and 3 after applying several ~ correction fa~tors (C.F.) for ~2 ~ • • . • • . • 62 ~2 Estimates for the marginal bias of .80 ~ for cases 1, 2 8J1d 3 . . . . . . . . . . . . . . . . . 4.6 The average of EB estimates for ~2 (uncorrected) and estimates of its marginal bias for case 1, where p = 2 and N = 10 . • • • • • • . . . . 64 The ratio of the average squared error of EB to OLS estimators for cases 4, 5, 6, 7, 8 and 9 . . . OLS and EB estimates for regression coefficients and error variances for the ten corporations • • • 70 5·2 Zellner's two-stage L8 estimates for regression coefficients for the ten corporations • • • • • • • • 71 5·3 OLS and EB estimates for elasticities of substitution in import demand between flue-cured tobacco from the United states and tobacco from other countries . • • 73 Generated values of c, a, ~, ~2 and their OLS and EB estimates, taken from one replication, where N = 10, T = 15, for cases 1 through 9 . . . . . . . . . . . 81 5·1 8.1 INTRODUCTION 1. Linear regression models have been widely used in econometric analysis using time series and/or cross section data. In many cases reported, the regression coefficient vector as well as the variance of disturbances is assumed to be fixed in successive observations. However, in some applications, the constancy of the coefficient vector may be questioned. For example, suppose a particular coefficient represents the response of a plant to nitrqgen fertilizer. It is well-known that this response is strongly influenced by temperature, ~ainfall and ~oil characteristics. If these can be held constant, we might expect that the coefficient is constant. If they vary but can be observed, it is desirable to incorporate their influences into the model, but if they are unobserved, we might be better to assume the coefficient to be random across locations and fixed within each location. Similar considerations may also apply in econometric studies. observed by Klein As (1953), it is unlikely that interindividual differences observed in a cross section sample can be explained by a simple regression with a few independent variables. In deriving the production function, the supply function for factors, and the demand function for a prOduct, Nerlove (1965) found it appropriate to treat the elasticities of output with respect to inputs and of factor supplies and product demand as random variables differing from firm to firm. Kuh (1959) and Nerlove (1965) treated the intercept as random and the slopes as fixed parameters in estimating a relationship from a time series of cross sections. Zellner (1966) pointed out that there 2 would be no aggregation bias in the least squares estimates of coefficients in a macro-equation, if we assumed the coefficients of the micro-equations to be random. By assuming the coefficient vector to be random across units, Swamy (1970) presented a consistent and asymptotically efficient estimator for the mean and an unbiased estimator for the variance-covariance matrix of the coefficient vector. If we can assume the coefficient vector to be random across units, but fixed within units as proposed by Swamy (1970) there is no reason why we cannot assume that the X matrix and/or the variance of residuals are also random across units but fixed within units, since we may consider these units as a random sample from a larger population. These are the regression models, !. ~., by assuming that the coefficient vector or the variance of residuals or the X matrix or some combinations of them are random across units but fixed within units, that will be studied in this thesis and we may call them random regression models. The regression models described by Swamy (1970) above are called random coefficient regression models. As pointed out by Swamy (1970), there is a close connection between random coefficient regression models and Bayesian inference, since the latter is also based on the assumption that a parameter in this case the coefficient vector -- is considered as a random variable with a fixed known ~ priori distribution. Any restriction or prior knowledge about the possible values of a parameter can be incorporated quite readily via an ing the analysis of the data. ~ priori distribution, before perform- However, a common objection for Bayesian inference is that the requirement of the form of the ~ priori 3 distribution to be assumed known, which many statisticians consider it to be unrealistic, although Jeffreys (1961) suggested to use a diffuse prior in case there is no knowledge at all about its form. Bayesian analysis has been applied to several econometric problems, for examples, Tiao and Zellner (1965) on mUltiple regression models with autocorrelated errors, Zellner and Park (1965) on a class of distributed lag models, Zellner and Geisel (1970) on distributed lag models with applications to quarterly consumption function estimation, Chetty (1971) on Solow's distributed lag models and Chetty (1968) on pooling time series and cross section data. In this thesis we shall present another approach, called empirical Bayes (Robbins, 1955), in estimation of the coefficient vector and the variance of disturbances for random regression models. Four multiple linear regression models are considered and empirical Bayes estimators for each model are presented. The difference between empirical Bayes and Bayesian inference is that the first does not assume at all about the knowledge of the form of the prior distribution of a parameter. A comparison of empirical Bayes to ordinary least squares estimators is also given for a special case of a simple linear regression model. 4 2. REVIEW OF LITERATURE The empirical Bayes approach was introduced by Robbins (1955). It is applicable when the same decision problem presents itself repeatedly and independently with a fixed but unknown distribution of the parameter. ~ priori The statistical decision problem that we consiqer comprises: e (1) A parameter space (2) An observable random variable space 1 fined. on which a f(xle) ~ function of X has a density ~. ~ =e and &. In & is a x. A loss function L(&,e)~ 0 , representing the loss we incur in taking 0 the true parameter is (5) e, is de- ~ with generic element estimation problem we take (4) measure ~-finite with respect to A decision space e. X belonging to a When the parameter is function (3) with generic element as an estimator of e, when e. An ~ priori distribution of G of e defined on ® , which is unknown. The problem is to choose a function observe x we shall take incur the loss L[&(x),eJ o(x) 0, such that when we as an estimate of The expected loss when & and thereby e is the true parameter is given by R(o,e) SlL[&(x),e]f(x\e)$(x) . (2.1) 5 Hence, the overall expected loss when e has an ~ priori distribution G is (2.2) which is called the Bayes risk relative to G. Rewrite (2.2) by using (2.1), R(5,G) :; Ie Sx. L[5 (x), eJf (x le)d,u (x)dG(e) (2.3 ) :; .fx. ~G[5 (x),x]d,u (x) where ~G[5(x),xJ :; The objective is to find IeL[5(x),eJf(x\e)dG(e) 5 G (2.4) • such that ~G[5G(x),xJ :; min ~G[5(X),XJ 5E$) which means that for any (2.5) OEl), we have (2.6) Since no assumption is made about the form of (2.6) cannot be derived and, therefore, However, an asymptotically optimal 0G 5 G, (2.2) through cannot also be obtained. relative to 5 , was proposed by Robbins (1964). We may consider N estimate of 5 . The procedures are as follows. G by G, denoted 5 as an N 6 Let (xu-' eN) (xl,e l ), (X ,S2)' .•• , 2 be a random sample of size N from a papulation of (x,e)'s whose joint distribution function is F(x\S)G(e). Xl' X2 ' ••• , X are observable random N Note that variables, while e , e , .•. , eN l 2 are unobservable. Now, construct the following function so that we shall take thereby incur the loss 6N(~+1) E ~ as an estimate of L[oN(~+1),eN+1J. the expected loss in estimating e + , and N l For a given sequence e N+l , given Xl' x 2 ' ... , ~ (ON} , by (2.3), is and hence the overall expected loss is given by where E denotes the expectation with respect to the N independent Xl' X ' ... , xu- ' which have a common (marginal) 2 density with respect to g on 1, given by random variables (2.8) From (2.6) and (2.7), it follows that , 7 If a sequencr (ON} can be found such that lim R( oN' G) == R(G) , N-ioo we say that [ON} is asymptotically optimal relative to G. The problem now is whether such a sequenre exists and, further, whether a sequence every G € q [ON} which is asymptoticallY optimal relative to q can be found; distribution on may be the class of all possible ®. One way to obtain [ON} is to find a sequence as a sequence of distribution functions of as N ~ 00, i. e such that d GN(e) ~ G(e) ~., lim GN(e) == G(e) N~oo at every continuity point of G. (1964) presented a general Robbins method for constructin~ a particular sequenc~ GN(e) for an unknown G(e) and then proved a theorem that under appropriate conditions on the family of hold for any as estimators G whatever. f(xle), GN(e) ~ G(e) Denoting FG{x) == as N-> 00 will LOOooF(xla)dG(e) , these conditions are: (1) For every x, F(xle) is a continuous function of e ; (2) Both lim e~ F(x I 00 - 00 F(xle) == F(xl- 00) exist for every x and lim e~oo f(xle) == 8 (3) Neither F(xl-c:o) nor F(xlc:o) isa distribution function; (4) If Gl , G 2 of e (1964) Robbins are any distribution functions such that , then also gave a method of obtaining directly a sequence (&N} , which bypasses the estimation of of f(xle) G, if the parametric family and the loss function are given. Several other works on empirical Bayes techniques on estimation or testing hypotheses were reported by Samuel (1968), Clemmer and Krutchkoff and Krutchkoff (1969) Martz and Krutchkoff (1969). and Maritz (1969) of Martz and Krutchkoff (1963), (1967), Krutchkoff (1969), Rutherford Among these works, only that will be cited below, since it is closely related to the work presented in this thesis. They considered a sequence of experiments using a multiple linear regression model y. = 2$. + where X..J. E., -~ -~ is a k X 1 for -~ i (2.10) = 1, 2, .•• , N observable random vector, X is a k X P matrix of nonstochastic independent variables which is common to all experiments, ~. -~ is a p X 1 realization of a random vector prior distribution G(~) and coefficient vector which is the ith ~ E. -~ with an unknown and unspecified is a k X 1 random vector distributed as normal with mean 0 and common known variance ~2I By making use of observations from previous experiments, they derived an empirical Bayes estimator for the coefficient vector in the Nth 9 experiment. They also pointed out, by a Monte Carla study using a simple linear regression model, that the larger N the smaller the ratio of the average squared error for the empirical Bayes estimator of ~N to the average squared error of the maximum likelihood estimator. With the number of experiences as small as five, a substantial improvement for the empirical Bayes estimator was obtained in many cases. The degree of improvement is also affected by the form of the prior distribution of an exceptionally diffuse prior will give a ~, ratio of average squared errors close to one for every value of If ~ 2 N. is unknown, as is the case in practice, they suggested to use the least squares estimator for experiment) data, where By using s2 instead of £N ~ 2 by using the present (the Nth is the least squares estimator of 2 ~ ~N , they found only a slight effect on the ratio of the averag! squared errors. 10 3. BAYES AND EMPIRICAL BAYES ESTIMATION IN RANDOM REGRESSION MODELS Before we consider the mUltiple linear regression model, let us briefly review Bayes estimation in general. 3.1 Bayes Estimation In General Let be an observable random variable in a Euclidean space X with a conditional density f(xle), where e is the parameter. 1 e is assumed to be a realization of an unobservable random variable whose (prior) density is over a parameter space ® • g(e) The joint density of X and h(x,e) the marginal density of e e is f(xle)g(e) (3.1) X is (3.2) and the posterior density of e p(alx) = given X =x is ~«X») (3.3) g By using a quadratic loss function an estimator of e, ,.. ,.. L(a,e) = (a-e) 2 ,.. ,where e the overall (Bayes) risk is given by (3.4) The Bayes estimator for ,.. respect to a, namely, a is obtained by minimizing (3.4) with is 11 (3.5) Note that .... e is a function of B x only, therefore, its expectation is given by (3.6) = SxSeeh(x,e)dedX = Jeeg(e)de = E(e) • .... e unbiased. Further, it will be shown that Bayes estimator based on a sufficient statistic for Let is called marginally B e is the same as that of (3.5). T be a sufficient statistic for factorization theorem (Kendall and stuart, e. By the Neyman 1961), f(xle) can be 1 factorized into (3·7) f(xle) = f(tle)q(x) where f (t Ie) is a function of of e given T over 3' is the conditional density of x X=x that does not involve as in (3.3), and q (x) The posterior density e by substituting (3.7), can be written as p(elx) = 1 ~he symbol 1~~:I:~~t:~de = ...,....--;.:~~~.f-- = f(e\t) • (3.8) f in f(xle), f(tle) and f(elt) do not represent the same functional form, its use 1S merely to indicate the density function, whatever form it might take. 12 Substituting (3.8) into (3.5) gives us (3.9) which is what we wanted to show. In the next presentation about mUltiple linear regression model, Bayes estimators will be based on sufficient statistics rather than the original random variables. 3.2 Least Squares Estimation Consider the following mUltiple linear regression model (3.10) y.. is where ~ T Xl, X is a is distributed with mean unknown fixed parameters. ~ estimators for and ~ 0 TXP nonstochastic with full rank, and variance ~ 2 2 (~,~) I ,and are It is well-known that the least squares 2 are respectively given by (3.11) and By additional assumption of normality of are also maximum likelihood estimators for after correction for bias. for ~ and s 2 b ~ (3.11) and (3.12) and 2 ~ , respectively, is the best linear unbiased estimator is the best unbiased estimator for ~ 2 . 13 The density function of S = (T-p)s 2 • (Kendall and Stuart, 2 ~ and S can be expressed as 1 exp[- ~[S + (b - t3)'X'X(b - t3)J} (a\[2II)T 2ci -- = where ~ Therefore, by the Neyman factorization theorem 1961), b and S 3.3 are jointly sufficient for Further, it has been shown (Graybill, 1961), that N[~,~2(X,x)-lJ and are independently distributed as (T-p) (3.13) £ ~ and X2~2 with degrees freedom, respectively. Bayes Estimation In Random Regression Models In this section, four variants of multiple linear regression model and their respective Bayes estimation will be presented. The four variants of the model will be denoted by model A, B, C and D as described below. and Krutchkoff Bayes estimation for model A follows that of Martz (1969), while those for models B, C and D were developed analogous to their results. Model A: ~ 'IX 1 = X ~ 'IXp IP< 1 + € 'IX 1 where (a) € (b) X and is distributed as r:l 2 NlQ,~ I) ; are fixed and known, where is a matrix of full rank; X 14 (c) ~ is an unknown realization of a random variaqle ~ with a known prior density Note that once the random variable assumed to be constant over estimator for T ~ g(~) takes a value successive observations. will be based on a sufficient statistic ~ ~, The b it is Bayes for ~ as in (3.9). The conditional density of b is given by 2 IX/Xr~ 1 f(bIX,R~ ) = ~ exp(- ---2(Xb - ~)/(Xb - 2~ (o\[11)P 1:1. - - - - ~)} • - (3.14 ) Aside, 1 - --- X/X(b - ~) 2 -~ or ~ = £ + ~2(XIX)-1 1 2 ~b f(£lx,~,~2) f(£ IX,~, ~) - The posterior density of ~ given 2 (X,£,~) is p(~IX,b,~2) = h(Q,~, Ix,~2) = ~X,~,~'g(~) where - h (blx,~2) g- (3.15 ) h (blx,~2) g- (3.16 ) h(£,~IX,~2) is the conditional joint density of b and ~, while Therefore, from (3.9), the Bayes estimator for ~ is given by 15 By substituting (3.15) into (3.17), we obtain 1 A ~=---- h (b \x,<i) S[£ 2 -1 + ~ (XIX) g- 1 I a 2 2 ab f(£ X,~,~ )Jf(£IX,~,~ )g(~)~ f(£IX,~,~) 2 (3.18 ) where h (bIX,~2) g- the first partial derivative of From with respect to b. (3.6), we have that ~ is marginal~ unbiased, ~.~., E(~) = E(~) . Note that hg(£lx,~2) depends on g(~) which we denote by putting a subscript g. Model B: ~ .l=x 'IX 1 +€ 'IXp pXl ']Xl where is distributed as N(Q,~2I) ; (a) € (b) X is fixed, known and with full rank; (c) (~,~2) is an unknown realization of a random variable density (~A,_, ,.,.2) v 2 g(~, ~ Wl" ) • th a kn own JOln ." t prlor " 16 From (3.13), we know that b and = (T-p)s 2 are jointly sufficient (~,~2) and are independently distributed as N[~,(X/X)-1~2J and for X2~2 with (T-p) ~ estimators for degrees of freedom, respectively. and 2 ~ density of b conditional density of freedom. and Denote f(£IX,~,~2) N[~, (X/X)-1~2J which is S The Bayes , therefore, will be based on these 2 sufficient statistics. X S which is to be the conditional and X2~2 fT_P(sl~2) with Note that the conditional density of (T-p) S as the degrees of does not depend on ~. For a given (X,~,~2), it can be shown that £ and independent, therefore, the conditional joint density of denoted by b are and S, 2 fT_p(£,SIX,~,~), can be expressed as the product of the conditional density of Since S S b and that of is distributed as S, ~.~., X2~2 with (T-p) degrees of freedom, we can write (3.20 ) or, 2 The symbol f does not have to represent the same functional form, it merely indicates the density function and the index (T-p) or (T-p-2) of f indicates the degrees of freedom. 17 s t(~) S ~(T-p-2)-1 S(-2) e cr 2 = -cr--....-01.,.(T--2.....,)~T--2--T-, -2- = -T_-S-_-2 f T_p _2 (S Icr ) ~ -p~ (-p-) P ) ( 2 2 2 r 2 (3.21) f _ _ (slcr2 ) is the conditional density of TP 2 type with (T-p-2) degrees of freedom. S which is a X2 _ where The posterior joint density of 2 f(~,cr IX,E.,S) and ~ cr2 , given (x,.!?, S) , is 2 2 fT_P(E.,slx,~,cr )g(~,cr ) f (b, S IX) T-p - (3.22) where the numerator of (3.22) is the conditional joint density of E., s, ~ and cr2 , while ~ From (3.9), the Bayes estimator for A ~ ~ 2 = E(~lX,.!?,S) = SJ~f(~,cr is given by IX,E.,S)~dcr 2 . substituting (3.15) and (3022) into (3.23) gives us A f3 - = SS[b - + cr2( XI X)-1 0h f (I E. X,~,cr 2) 0.... f(bIX,~,cr2) 2 fT_P(E.,S IX,~,cr2) g(~,cr) ] - Further, by making use of (3.19), we have, fT_p(E.,slx) ~dcr2 18 == l? + f Next, (X/X)-l (b SfX) T-p -' app~ng -b+ - - SS[ab 0 - I 2 2 I2 2 2 f(l? X,~,O" )J[O" fT_P(S 0" )Jg(~,O" )~dO" • (3.21), we obtain S(X/X)-l T-p-2 f T(1) __ 2(l?,S IX) ~ f -p -b,S X) T (3.24 ) where Similarly, from (3.9), the Bayes estimator for ,,2 0" == E(O"2\ X,l?,S) == 0" 2 is given by SS 2f(~,O" 21 X,l?,S)~dO" 2 • 0" Substituting (3,22) and (3.19) successively, gives us 2 2 2 2 ,,2 _ 0" f T-p (l?, S IX,~, 0" )g (~, 0" )~dO" 0" fT_p(l?,slx) SS _ - SS f(l?IX,~,O" 2 )[0"2fT_;p(S 10"2 )Jg~,O" 2 )~dO" 2 f Now, by using (3.21), we have (b,S\X) T-p - • 19 Note that both " ~ ,," 2 and and 0- g (~, depend on f(l) (b SIX) T-p-2 -' 2 0- ) f _ (£, T p since , g(~, depend on 2 0- ) • Model C: X Y.. 'IX 1 f3_ 'IXp :px 1 € + - 'IX 1 where (a) E (b) (X,o-) is distributed as 2 variable 2 N(Q,o- I) is a known realization of a random 2 (X, 0- ) , while ~ is an unknown realization of a random variable three random variables X, ~ The ~. and 2 0- have a known joint prior density (c) X is a matrix of full rank. Observe that Model C is similar to Model A in the sense that 0- 2 X and are known, therefore the only parameter to be estimated is The posterior density of ~ given 2 (X,~,o-) ~. is given by (3.26 ) 20 where and the numerator of x, ~ and 2 CJ (3.26) is the joint conditional density of b, • ~, Therefore, the Bayes estimator for By substituting (3.15) the Bayes estimator for into (3.27), using (3.9) analogous to and (3.18), is we obtain as follows: ~ (1) 2 1 h CJ (X/X)g 2 (b,X,CJ) 2 h (b,X,CJ ) g- where Model D: Y.. 'lXl ~ X = pXl 'JXp + E 'lXl where 2 N(0,CJ I) ; (a) € (b) X is a known rea.lization of a random variable is distributed as X , while (3.26) ? (~,CJ-) is an unknown realization of (3.28 ) 21 a random variable (~, (l). variables X, prior density (c) 2 and cr g(X,~,cr 2 ~ The three random have a ,known joint ) X is a matrix of full rank. Note that Model D is similar to Model B in the sense that 2 (~,cr) and X is known is the parameter to be estimated. The derivation of Bayes estimators for to that of Model B, namely the Bay'es estimator for ... ~==b+ S(X/X)-l T-p-2 cr2 and ~ ~ is analogous is given by f~:~_2(X,:£,S) (3.29) f T-p (X,b,S) - where (1) 0 f _ _2 (X,:£,S) == 0:£ f _ _2 (X,:£,S) , TP TP and the Bayes estimator for ... 2 (J' In this case ... ~ 2 cr S f T_ _2 (X,£,S) P == T-p-2 f (X,b,S) T-p - and ~2 depend on and (X,:£, S) , Remarks: is given by g(X,~,S) f (1) _ _ (X,:£,S) TP 2 (3.30) since all of depend on f . T-p, g(X,~,cr2). If we are dealing with a single mUltiple linear regression model, Model A and Model B, ~.~., the models with fixed X, are the ones that are commonly used in Bayes estimation. However, when we are dealing with several regression equations, where the X matrix as well as ~ and cr2 may differ from one regression equation to another, we might be willing to assume Model C or D. 22 For Mode.l A and C, we assume ttat cases is not the case. s '::J O"~ If 0" 2 is known, whi chin mos t is unknown, one might wonder of using 2 2 2 , the least squares estimate for O' , instead of the true 0" to obtain the Bayes estimate for Bayes estimate for ~, say " .~, ~. IE this case, the obtained wil.l not minimize the overall risk (Bayes risk) for all possib.le values of the Bayes risk for a special case when On the other hand, when 2 0" 0" 0" 2 rather it only minimizes , 2 s 2 is unkrlOwn as assumed in Models B '::J (~,O"~), and D, the Bayes estimate for , denoted by minimize the Bayes risk over all. possible values of (~, 0-2 ) , wi 11 2 (~, 0" ) • Note that for al.l models A, B, C and D, the knowledge of the exact form of the prior density must be assumed to obtain the Bayes estimators. If there is no knowledge at all about the prior density, Jeffreys (1961) suggested to use a uniform prior density as follows. When the range of a parameter e is from -c to c, where and may take value +00, the prior density is taken to be For the case where eE(O,C) density of log lei or 6E(-C,O), where c > 0 g(e)o:;l. C > 0 , the prior is taken to be uniform, ~.~., f (log Ie \ )0:;1 or gee) 0:; \e\-l. Applying Jeffrey's suggestion of uniform prior density for Mode1S A, B, C and D, we have: 23 Model A: g (~)cx:l , since _ ~ E E :P (p dimension Euclidean srace), Model B: ··1 g (~, cr )cx:cr· , since and cr>O. Models C and D: 2 -1 g (X"~' cr )cx:cr· , since 3.4 X c E'D<p , ~ (:.E p and cr > 0 • Empirical Bayes Estimation Recal.l that the Bayes estimators for Models A, B, C and D cannot g(~) be obtained, unless the prior densitips for Model B, 2 g (X,~, cr) for Mode.l A, for Model C and Mode 1 D are known. 2 g(~,cr ) In practice, this is usually not t.he case and it is unrealistic to assume a specific form of a prior density. In this section, this assumption about the knowledge of the prior density will be rel.axed and a two-stage Bayes estimator, which is called empirical Bayes (~stimator, will. be derived. the four models discussed in Section denoted by Model. A*, B*, C* ~Dd D*, 3.3, the following four models, will be considered. Model A*: Y.,,=X ~. -],-1. ']Xl 'D<:p pXl where + c. -]. ']Xl ' In accordance with i 1, 2, •.• , N 24 ~l' ~2' ... , ~N (a) are independent with a 2 common distribution 2 (X,o-) (b) all is N(.Q,a I) fixed and known, common for i = 1, 2, ... , N. The matrix X is of full rank; ~l' ~2' (c) ~N 0." are unknown independent realizations of a random variable an unknow~ with ~ g(~). prior density This is the model analyzed by Martz and Krutchkoff they derived an empirical Bayes estimator for We shall review ~N' their derivation below and obtain an empirical Bayes 13. (i = 1, 2, .. u, N) N) , by g(~) been known, the Bayes estimate for = -b.l + 2 0- -1 h 13. -l (i 1, 2, ... , (X'X)· (1) (£. IX, (j2 ) g l h (b. g Now, since 2 (X,o-) -l IX, is common for all (£l'~l)' (£2'~2)' •.. , of a random variable equation for (3.18), would be given by p. -l consider ~stimate 0 -l Had (1969), where (£,~) (£N'~N) (3.31) 2 (J ) i = 1, 2, •.• , N, we can as independent realizations whose joint density is denoted (recall (3.16) for Model A of Section 3.3) by E. 2 , Therefore, we can also consider that £.1.' realizations of a random variable whose (marginal) density is b ••• , £N are independent 25 hg(~lx,~2) in (3.16). denoted by common for all i 1, 2, .•. , N , we can. just denote this marginal h (b) . density by g- Noting that variable (X,~2) is For simplicity, since b hg-l (b.) is a multivariate densi ty of a random evaluated at b = E.i ' and ~l' ~2' •.. , ~N are observable and independent, several methods for estimating are available such as the method described by Cacoullus Let f(~) ~ = x. -l ! 3 A quadratic mean consistent asymptotically unbiased estimate for independent observations g (1966) below. be a multivariate density of a random vector (]Xl) , evaluated at h (b.) !l' !2' .•• , f(~), ~ P N II D.. based on the sample of ~, of ... , 1 and x -x .p D. is given by kp) (3.32) P j=i J where IKCyJ I <00, (a) sup .lEEp (b) SI KCyJ IdX < 00, (c) ~.~. , K is bounded; K is integrable; \y\PK(y) = 0 , where lim Ixl-' ~.~., oo length of vector Iyl denotes the X 3 Consistency here is in the sense of mean squared error consistent, ~.~., lim N-.oo E[fN(~)-f(~)J2 = 0 .. at every continuity point of f . - N 26 (d) SK(I)dl = 1 ; (e) b.. = b.. (N) satisfies J lim b.. (N) = 0 N-+ CO J J lim N[b. . (N)J P N-+ co J = co, for j = By using £1' £2' ..• , £N as and 1, 2, .•• , P • N independent observations and setting 7~ l > x -x. *# •• , P l 1 P (2II)p j=l 2 . k J] 25 j Sln p) = --:::- II b.p b .. -b [lJ • (3.33 ) [bij-bkjJ 25. J and defining . (bij-bkj ) 25. Sln _ .........._,....;JII--_ = 1 , for b .. -b . (lJ k J) k =i , 20 j a consistent and asymptotically unbiased estimate for hg (b.) , -l therefore, is given by 2 l 1+ N L: P II b .. -b k=l j=l (lJ kfi 25. J where (1) 5. = 0 (N)cr J \/t (X/Xf 1 } JJ..' ; . kJ ] (3.34 ) 27 (2) , (X/X)-l}jj is the (j,j)th element of (X'X) -1 ; (3) .. 0 (N) = 0 and lim 0 (N) satisfies N... ex:> lim N[6 (N)]P N... ex:> 1 6 (N) = N) q = ex:>, for example, take q > p • wi th Note that all of the conditions (a) to (e) given in met by the function K of (3.33). (3.32) are We might use functions other than (3.33) as long as it satisfies conditions (a) to (e) given in (3.32); the choice of (3.33) is made so that this work will be comparable to that of Martz and Krutchkoff (1969). At this point, a study to find a function that will give better empirical Bayes estimators, if possible the best in some sense, is still ~en. of (3.32) is a Furthermore, observe that p X 1 vector and can be wri tten as h(l\b.) g -~ where b. -~ = [~bO () il h (b.), •.. , ~bO h (b. )JI g -~ () ip g-~ = [b. l , b. 2, ..• , b. J/. ~ ~ By the definition of the first ~p derivative of a function, (3.35) .can be expressed as 1 h ( ) (b.) g -~ = [hg 1 (b. ), ... , hgp (b.) J' -J. (3.36 ) -~ where h . (b.) gJ -~ h ([b. l' ••• , b. . + g ~ = lim --!ii!._= €... O ~J ~ € €, ••• , b. JI) - h (b.) __ ~p --:;;~ g -~ -.:.L~ 28 for , = 1, 2, •.. , p j By choosing E such that lim E =0 for example we can choose N... co E = 5.J of (3.34) -- and using (3.32), h .(b.) gJ of -J. (3.36) can be estimated consistently by which in turn gives a consistent estimator for h-~l) (b.) -J. -1'4 An as follows: [h-_l(b.), ... , -l.\J"p h-_ (b.)J I -1'4 -J. -J. = empirical Bayes estimator for (3.38) • is obtained by substituting ~. -J. (3.34) and (3.38) into (3.31), namely , i = 1, 2, .•• , N • By using an argument as given by Rutherford and Krutchkoff (1969), ~. it can be shown that the overall risk of "~. the Bayes risk of of -J. asymptotically optimal. (3.31), as ~. -J. of (3.39) converges to N ... co, ~.~., Rutherford and Krutchkoff ~ ~i of (3.39) is (1969) gave a set of sufficient conditions for an empirical Bayes estimator to be asymptotically optimal, when the squared error loss function is used. Denoting e, a and ~ e as the Bayes and empirical Bayes estimators for respectively, those sufficient conditions are given at the top of the next page. 29 " plim "e (a) = "e . N-+co , (b) )' > 0 For some we have t hat "" ~i Indeed, of and some real number E( \ e \2+)') ~ M < CO M < co , • (3.39) can be shown to satisfy condition (a) as follows. ~(£i) By the consistency of Cacoullus' estimators and ~l)(£i)' we have that h (b.) g -~ and Therefore, for ~ plim~. N-+co -~ = b. -~ ~ ~. -~ of (3.39), we have = -~ b. + + Further, by imposing a mild condition on the prior density such that their third absolute moments are bounded, we have that condition (b) is also satisfied. Model B*: Y...~ = ':IX 1 where x ~.+ -~ 'D<p pX 1 §.~, ..... 'D<l i 1, 2, •.• , N g(~) 30 (a) (b) E. is distributed as .§.2' ... , -~ 2 N(O,ao.I) - and .§.l' ~ are independent; .§.N X is fixed, known and common for all i = 1, 2, ..• , N ; independent realizations of a random variable (~, 0"2) g(~, 0" 2 wi th an unknown pri or densi ty ) • Note that once the random variable (~,0"2) takes a value, say 2 (~.,O".), -~ ~ it is fixed for regression equation. f3. (i = -~ Had successive observations in each ith T g(~,O" 2 ) been known, the Bayes estimate for 1, 2, .•• , N) , from (3.25), is given by A f3. = b. -~ -~ (3.40) and the Bayes estimate for o-~ 2 = ~ _S_i_ f T- E- 2( O?·i' Si1xr'X)) T-p-2 f -p -2 b.,S.~ X T Observe that since can consider 2 0". , from (3.26), is given by X is common for all (3.41) i = 1, 2, ••• , N , we 2 2 2 (£l'Sl'~l'O"l)' (!?2,S2'~2'0"2)' ... , (!2-N,SN'~N'O"N) independent realizations of a random variable 2 (£,S,~,O") as whose joint density is denoted (recall equation (3.22) for Model B of Section 3.3) Thus, can also be considered as independent realizations of a random variable (£,S) whooo (marginal) density is fT_P(£,sIX) from (3.22). 31 Thus, (T-p) :. f -p (b.,S. IX) T -1 1 degrees of freedom of a random variable (b,S) = (b.,S.). -1 1 Similarly, multivariate density with variable evaluated at i t:: is the value of a multivariate density with 1, 2, ..• , N, f -p- 2(b.,S. Ix) -1 1 T (T-p-2) (b.,S.). -1 1 evaluated at is the value of a degrees of freedom of a random Since f -p (b.,S.\X) T -1 1 (£,S) and X is common for all f -p- 2(b.,s.lx) -1 1 T can simply be denoted by f -p (b.,S.) and f -p - 2(b.,S.), respectively. -1 1 T -1 1 T Again, in this case, Cacoullus' method can be applied to obtain consistent and asymptotically unbiased estimates for f T-p (b.,S.), -1 1 (1) f T-p- 2(b. ,S.) and f -p- 2(b., S.) , which will be denoted by -1 1 -1 1 T (1) (1) f T (b"Si)' f T 2(b.,S.) and f T 2(b.,S.), respectively. N, -p ~ N, -p- -1 1 N, -p- ~ 1 Therefore, SUbstituting these estimates into (3.40) we obtain an empirical Bayes estimate for ~ ~ ... A t:.i ' (3.42) = -i b + Similarly, by substituting into denoted by f N, T-p (b.,S.) -1 1 and f N,-pT 2(b.,S.) -1 1 (3.41), we obtain an empirical Bayes estimate for 2 ~. 1 , denoted by The derivation of f (1) T 2(b.,S.) N, -p- /~1 1 f N, T-p (b.,S.), -1 1 will be presented below. S. 1 = T 2 1:: e .. j=l 1J f N, T-p - 2(b.,S.) -1 1 Since and 32 has (T-p) T (b.,S.) N, -p - l l as follows: degrees of freedom, (3.34), away, analogous to can be obtained right f 1 ,Sol ) , -p (b. -l fN T (2TT ) p+L_ P n 6. . -n'Y . 1 lJ J== ( t 2 S'-Sk 2". ] np 'Y sine N 1 + E k==l kfi S.-S [l k ~J . J==l . cbij-bkjj Sln 2J: U •• lJ kJ ] b. -b . [lj (3.44 ) 26 .. lJ where (1) 'Y == (2) § == - (3 ) 6(N) ~- 1 E N l= . 1 (S.-S) 2 l 1 N E S. ; N l:;:: . 1 J: u (N) ij == U l: (4 ) (5) N l _ ~ S. 1 ..2:-r (X 'X ) - } . T-pt jJ is the 6 (N) satisfies lim N[ 6 (N) N-+ co 1 t+1 (j,j)th lim (N) == 0 N-+co =co, the random variable and for example, take 6(N) == N q, where q > p The derivation of element of f , T-p- 2(b.,S.) -l l N + 1 • is rather complicated, since 33 = S has (T-p) variable T .E e. j=l J degrees of freedom. S* 2 However, we can construct a random a random variabJ,.e with derived from S, i j' = 1, S* possibili ty of which pair of free ej's. , 2, .•. , T . T > p + 2, But, for of (3.45) gives more than one e.' s to be thrown away out of J For uniformity, it is suggested to take T. j S* (T-p) as the (S-e~-e: )'s and all possible (T-p) free J JI average of all possible e 's out of (T-p) ~.~., S* = S _ e 2 - e 2 1 j j j degrees of freedom -- S, for example by throwing away two out of degrees freedom of where (T-p-2) Suppose the first (T-p) of e.'s are free, then we J have s* = 1 T-p T-p 1: .E (T-p) j=l 2 1 = (T-P) 2 j j/=l J J < j' T-p .E 2 2 (S-e. - e. 1 ) T-:p T-P .E [.E j '=1 k=l j=l j < j 1 kfj k:/=j 1 T-p T T T-E-2 2 2 T-E-2 2 2 '" T.E e k + .E ek = S + T-P .E ek • p k=l k=T-p+l T-p k=T-p+l (3.46 ) 34 Since the free overall average of ej's can be any S* in (T-p) out of T of ej's, the (3.46) is given by s* = T-p-2 S + 1 ~ ~ (T-l)e 2 = [T-E-2 + ~]s = T-2S T-p ( T ) T-p k=l p-l k T-p T(T-p) T T-p By using the random variable degrees of freedom, instead of again analogous to S* (3.47) which has of S, we can derive (T-p-2) f , T-p- 2(b.,S.), -~ ~ N (3.34), as follows: 2 fN T , -p- ( 1 2(b.,S.) = --..:;;:;..----~ ~ p+ L P (2II) ~'Y II 5 .. j=l ~J (Sin[ k S.-S* + N. L: k=l k/=i S.-S* [~ K] 2y J k P . II J=l in[~J) S.-S'¥" [~ 2y ~] b..-bkj ) 2] sine ~J] 25 .. ~J (3.48 ) b .. -b . [~t .kJ ] ~J where (a) Sf T-2 = ~ Si ' for (b) 'Y and 5ij i = 1, 2, ... , N ; are as defined in (3.44) f (1)T 2(b.,S.) is obtained, analogous to N, -p- -~ ~ making use of (3.48), as follows. Finally, (3.38, ) by fN(lT) = [fN,· l , T-p- 2(b.,S.), ••• , f N, p, T-p- 2(b.,S. )], , -p- 2(b.,S.) -~ ~ -:l. ~ -~ ~ (3.49) 35 where f N,J, . T-p- 2(b.,S.) -2 2 1 = --~--(fN v , T-p- 2([b·2 l , ij . • •• , b.2p J', S.) - f 2 N, . , S.2 )} T-p- 2 (b -2 for j "" 1, 2, ..• , p . Before we discuss Model C* and D*, x = c..~,X*J , where X* = [!2' obse~ve ~, ... , ~J that, in general, . Model C*: Yi = 'D<l ~i Xi = 1, i 'D<p pxl 2, ••• , N or ~i 'D<l =: [1, Xf J ~i + §.i ' 'lXl 'IX (p-l) pXl 'D<l = 1, 2, ..• , N i where (a) €. -2 (i = 1, and §.l' §.2' ..• , §.N (b) (X!' o-~), is distributed as 2, ... , N) (X~, o-~), are independent; 2 ••• , (Xj,o-N) are known independent realizations of a random variable ~l' ~2' ••. , of a random X* , ~ g(X*,~,o- and 2 ~ (X*,0-2), while are unknown independent realizations variable~. 0- 2 N(O,o-.I) 2 2 The three random variables have an unknown prior joint density ) ; (c) X is a matrix of full rank. 36 Note that the assumptions for Model c* is the same as those for g(X*,~,~2) is unknown. Model C, except that the prior joint density Tne parameters to be estimated in Model C* are g(x*,~,~2) been known, the Bayes estimate for ~l' ~2' ••• , ~N' ~. -~ Had (i = 1, 2, •.• , N) , from (3.29) is given by 2 ' -1 '" ~ = b. + ~. ( X.X.) ... -~ ~ * h (1) (X. , b . , ~.2 ) g ~-~~ (3·50) 2 h (Xt,b.,~.) ~ ~ g ~ -~ ~ where, as before, (1) 2 0 2 hg (X1E",b.,~.) ="'-::-b hg (X1E",b.,~.) • ~ -~ ~ 0 h • ~ -~ ~ -~ Again, by a similar argument as for Models A* and B*, we can (Xl'£l,~i), (X~'£2'~~)' consider realizations of a random variable 2 hg (X*,b,~ ). to obtain a .•• , ~ -~ 2 (X*,£,~) as independent whose density is Therefore, Cacoullus ' method is again applicable here consi~tent and asymptotically unbiased estimate for h (X~,b.,~~) , a multivariate density g (XN'£N'~~) ~ (X*,b,~2) = (Xt,b.~~). ~ -~ ~ h (X*,b,~2) evaluated at g - Denote this estimate by h (Xt,b.,~~) • Note that we can write Xf. .= [~;2' .•• , x .. , ••• , x. J, .... -~J -~p for i = 1, 2, ••• , N where x .. = [x.~J'1' ••. , x ~Jm .. , ••• , x.~J'T]', . -~J for j 2, 3, •.• , p , which allows us to write N ~ -J., ~ 37 hg (X-tE-,b.,cl) == hg (x. , ... , -~p x. ,b.,cr~). ~ -~ ~ -~ 2 ~ ~ Applying Cacoullus' method, analogous to (3.44), we can obtain N I: k==l k/i. (3.51) where (1) Y == 5 (N) (2) cr -2 1 N 2 == - I: cr; ; N . 1 ... ~== lljm (4) = o(N) _ 1 x. • Jm O. . ~J (6 ) -2)2 ( cr2 -cr i I-~----- \~ i~l N I: x N i==l ijm =- = 0 (N)cr. \ Ir-(X-!-X.-)--l-}..~ ~ ~ JJ \It f(X!X. )-l} .. t l (xijm-X.jm)2 ~ JJ is the (j,j)th element of - (X!X. )-1 ~ l 38 (7) 0 (N) satisfies lim 0 (N) = 0 and N.... oo . ( ) (p-l)T+p+l lim N[ 0 N J N.... 00 o{N) While (1) lL~ -~ 2 (X~,b.,O".), J. -J. e; for example take = N q ,where q > {P-l)T+p+l . J. h (1) (X.* , b . , 0".2 ) estimate for =00, 1 l -l J. a consistent and asymptotically unbiased is obtained, analogous to (3.49), as follows: [h--l(X~,~,O"~), ... , h-_ (Xt,b.,O"~)J' N J. ~.l. J. Np J. -J. J. (3.52) where ... , lL_. (X~, b., O"~) = ~[lL_{X~, [b.J. l' J. -J. J. u. . -~ J. -~J b .. + 0.. , •.. , b. JI ,O"~) J.J J.J J.J J.p J. for j = 1, 2, ..• , p . Finally, by substituting (3.51) and (3.52) into (3.50), an empirical Bayes estimate for ,., '" ~. = b -J. -i + 2 0". J. (i = 1, 2, ... , N), is given by ~. -J. (X!X. ) 1 - * h (1) \ (X. , b. -~ J. -J. 2 ,0". ) J. 2 h-_(X~,b.,O".) J. J. . N J. -J. J. Model D*: VJ.' :J.. 'D<l =:: X.J . - 13. J. 'D<p IP< 1 + E. -J. 'lXl or, as in Model C*, we can write , i = 1, 2, ... , N (3.53 ) 39 li ']Xl = [1: , xf ] ']Xl 'IX (p-l) + , E. -~ 'lXl i = 1, 2, ... , N where (a) E. -~ is distributed as 2 N(Q, (Ji I) ... , .§.l' .§.2' i = 1, 2, N, and , for ... , .§.N are independent; (b) ... , x* are observable (known) inN dependent reali~ations of a random variable Xi, X*, 2 X* , while ar~ unobservable (unknown) independent realizations of a random variable (~, The three random variables and X*, ~ 2 (J ) . (J2 have an unknown prior joint density g(X*,~,(J (c) 2 ) ; X is a matrix of full rank. In this model, the (~2,(J2)' ••• , 2 2 (~N,(JN)' par~eters to be estimated are Observe that the assumptions for this model is the same as those for Model D, except that the prior joint density When ... , g(X*,~,(J2) is unknown. 2 g(X*,~,(J) N) , from is known, the Bayes es timate for 13. (i -~ = 1, 2, (3.30), will be given by (3·54 ) 40 2(.1 and the Bayes estimate for (J'. ~ :::; 1, 2, ..• , N) , from (3.31), will be given by S. f T-p- 2(X1E-,b J. - i ,S.J. ) ~i - T-p-2 f (X~,b.,S.) T-p J. -J. J. " 2 ~_ (3.55 ) By the same arguments as before, we can consider variable (X*,£,S) whose (marginal) density is f T-p fore, Cacoullus' method is applicable for estimating f -p- 2(X~,b.,S.) J. -J. J. T and fN,T_P(Xf'£i,Si) , fT(l) -p- 2(X~,b.,S.) J. -J. J. f N, T-p- 2(X~,b.,S.) J. -J. J. as before. and fN(lT) , -p- (X*,b,S). f T-p There- (X~,b.,S.), J. -J. J. Denoting 2(X*.,b.,S.) J. --J. as J. their respective (Cacoullus) estimates, empirical Bayes estimators for ~. -J. 2 and ~. J. will be given by substituting these estimates as before into (3.54)and (3.55) as follows: -1 A ~ . -J. S. (X!X. ) J. J. J. = -J. b. + --=-T--=::;"":;2::""--p- and "2 _ _ _ S.J._ f N, T-:8- 2(X~,b.,S.) J. -J. J. fri - T-p-2 f T (X~,b.,S.) N, -p J. -J. J. (3.57 ) Needless to show algebraic derivations, since they are analogous to those described in Model B* and C*, we have the following results: 41 f T N, -p 1 (X~, b. ,8. ) 1. -1. 1. N ~ k=l kfi . [Xi jm-XkjmJ) s1.n i p T n n 26 jm ( j=2 m:;::l b .. -b . [1.J2~ kj J s1.n p Vij n b .. -b , [1.J k J] j=l 26 .. 1.J wbere (1) Y = 6(N) _ (2 ) 8 (3 ) 6 (4 ) 1 =- \j 1 -N N _ 2 ~ (S.-8) . 1 1. 1.= N ~ 8. N i=l 1. is as defined in (3.51) ; 8. 1 6., = 6(N) T1. ((X!X.)- } .. ; 1.J -p 1. 1. JJ jm (5 ) ((X!X. )-1},. 1. 1. JJ (6 ) 6(N) is tbe (j,j)th element of sa.tisfies - lim 6(N) =0 (X!X. )-1 1. 1. and N... oo lim N[6{N)] (p-1)T+p+1 '" 00, for example we N... cn 42 1 can take q =N q, where > (p-1)T+p+1 . T 2(X~,bi'S,) ,-p~.,.. ~ fN 5(N) 1 = ---------------=~------------- (2II)(p-1)T+P+~y( ~ ~ 6. ) ~ 5.. j=2 111=1 JIll j=l 1J S'-S~02 Sin[~J ( S. -S*k . [Xi jIll_X kjIll ] ) 12 ] ~p T s~n 2· 6. ---.....:::;.y"--II II JIll S. -S'lE" . 2 1 x .. -x , [~] J= m= [lJm k JIll] 2y 26 jm . [ s~n N + I: k=l k/=i P S.-S1E" [-2:.-2.J 2y b .. -b . sine 1J k J] 25 II j=l 2 . ij where and * (1) T-2 S Si = T i' (2) y, 5 , 6 jm ij for and l' _- 5(N) 1, 2, .•• , N ; are as defined in (3.58 ) fN(ll 2(X~,b"S,) ,-I--P~ -;L ~ ~ [fN 1 T 2(X~,b.,S,), ~" , -p~ -~ ~ ••• , f N,p, T-p- 2 (X1E-, b, , S,~ ) l' ~ -~ (3.60) where 1 f N,J, ' T-p- 2(X~,b"S.) --(fN,-I--pm 2(X1E-,Cb'l' 1 -~ ~ ~ --~ U,. ~ ~ :LJ ... , b, ] ~p for j ~ I , S.) - f ~ T 2 (X~ , b , , S, )} ~ -~ ~ N, -p- 1, 2, ••• , P • 3.5 Remarks On Empirical Bayes Estimation As indicated by the assumptions, Model A* is applicable when we ' have a se t a f regress:Lon equat'~ons W:L'th common known. o2 (X,cr ) and cr2 4S ~ Martz and Krutchkoff (1969) pointed out, by means of a simulation study of a simple linear regression model, that the empirical Bayes estimator for ~ '" (denoted by ~N ) is better than the ordinary least squares estimate (denoted by £N ), in the sense .... that the average squared error for '" ~N is smaller than that of The more improvement will be gained, the larger N £N. (the number of experiments ). Note that Model B* is a generalization of Model A*, since if the random variable 2 cr in Model B* is degenerate of a known value Model B* will become Model A*. Similarly Model c* is also a generalization of Model A*, in this case if the random variable 2 (X*,cr) of Model c* is degenerate. generalization of Model B*. Model D*, however, is a cr 2 , 44 By assuming the and ~ 2 X matrix to be a random variable besides , as described in Model C* and D*, empirical Bayes estimation for these models will be more cumbersome as the ~ X matrix) becomes larger. TX P (the dimension of By putting some restrictions on this X matrix, we can obtain special cases of Model C* or D*. For example, the fol~owing model, denoted by Model TIt' is a special case of Model D*. ,Model Dt: ,x~J ::;: [_1 4 -'l. 'D<l 'JXl 'D<l ~. -~ + E. , -:L i 'D<l ~l 1, 2, ... , N where the assumptions are the same as those for Model D* with additional restrictions below: (a) T is an odd number; (b) The elements of x (c) x i2 ::;: iT x1E" x il ::;: x i3 - 2.5x , for il are equally spaced, -~ x ... ::;: i2 i ::;: ::;: x 1, 2, . .., iT .!.. ~. , xi,T-l N . The chaise of restriction (c) will be discussed later at the end of this section. Since is odd, we have a middle element of T By restrictions (b) and (c), be expressed in terms of c. ~ 7(T-l) ~m can as follows: i 4T+6m-10 xim ::;: x. ci ' :;:: 1, 2, .•• , N (3.61) m ::;: 1, 2, •.. , T From the above restrictions, we see that random variables and c assuming density have a one~to-one 2 (~*,~,O" ) correspondence. x* Therefore, instead of as a random variable with an unknown prior 2 2 g(~*,~,O" ) , we can as well assume that 4 random variable with an unknown prior density (c,~,O" ) g( c,~, 0" 2 ) is a • Cacoullus' estimates of the required multivariate densities as given in (3.58), (3.59) and (3.60) will become f 1 T (c.,~,S.):=: - - - - : : ; . - - - N, -p ~ -~ ~ 2 P n (2Il )P+ N)'t. . 1 J:=: &.. ~J 2 s. -Sk N I: k=l k}i Sin[?r O [Si-Sk 2'Y J c.-c k ] Sin[-k-J [ci-c k ] p . Sln . ) b .. -b k J] [~J n 2& .. ~J j:;;l 2t. where (1) A = 6(NlVl:.N. ~ ~= (2) c (3) )' 1 - 2 (c. -c) ~ N 1 I: c N ~= . 1 i and &•• ~J are as defined in (3.58); 4 2 2 The symbol g in g(~*,~,O") and g(c,~,O") do not necessarily represent the same functional form, it-merely indicates the prior density function. 46 (4) 6 (N) satisfies lim 6 (N) =0 and N.... ex:> N[ &(N) ]P+2 lim N.... ex:> = ex:>, for example take 1 - - 6 (N) = N q > p + 2 . q, where 2 f T 2(c.,b.,S.) N, -p~ -~ ~ 1 = -----+2 P (2rI )p N)' 6 n 6.. . 1 J= N + ~J S'-S~0 sin[-¥J Q S.-S~ [.2:...-..2:. J 2)' p . s~n b .. -b k . ) 2 [~J 21:V • • JJ n t k=l . 1 J= k/=i ~J b .. -b k · [~J J] 20 .. ~J (3.63 ) where )', f 6 ij , 6 and (1) T 2(c.,b.,S.) ~ -~ ~ N, -p- 6(N) = are as defined in (3.62). [fN , 1 , T -P- 2(c.,b.,S.), ... , ~ -~ ~ f N,p, T )J' -p- 2(c.,b.,S. ~ -~ ~ where ..• , b.~p J/,S.) - f , T -p- 2(c.,b.,S.)} ~ ~ -~ ~ N for j = 1, 2, ••• , P . (3.64 ) EB estimators for 2 and ~. -~ rr.~ , from (3.56) and (3.57), are therefore given by ~ S. (X!X. ) ~. -~ ~ -1 ~ (1) f N, T-p- 2(c.,b.,S.) ~ -~ ~ :;:: b. + --=T=---2-=---~ -pf N, T-p- 2(c.,b.,S.) ~""""l ~ ~2 S.. rr ",,--=~,::" f b . , S. ) N T -p- 2 ( c ~. , -~ ~ (3.66) ' T-p-2 f , T-p (ci,b.,S.) -~ ~ N i where ~ X. :;:: [l,x~J . --:I. ~ The essence of restricting the X matrix in Model D*, is actually to make it possible to have a one-to-one correspondence between matrix and a scalar variable simplified. X c, so that the estimation is much more Other kinds of restrictions on X might as well be imposed to have a one-to-one correspondence between X matrix and a scalar variable. The matrix T X (p-l). X*, which is one-to-one to X, has a dimension of X* Any restriction on one correspondence between dimension of X* Y is less than such that we can have a one-to- and a multivariate T X (p-l) Y, where the will, therefore, simplify the estimation, in the sense that we have variables with less dimension in the joint prior density. Now, some arguments for choosing restriction (c), ~.~., x iT :;:: 2.5xil (i :;:: 1, 2, ••. , N) , will be given below: Consider a uniform distribution 1 f(x) :;:: [ b-a ' o , for elsewhere 48 The mean and variance of (3.67), denoted by given by g x = ~(b+a) 212 ~x = 12(b-a) . and g x and 2 ~x ' are Therefore, the coefficient of variation will be given by ~ c. v. = - x = fJ. x b-a ·577350 b+a (3.68 ) Relation (3.68) can be rewritten as b a 1 + l.732052 C.V. 1 - l.732052 C.V. and is tabulated below for several values of C. V. Table 3.l C. V. -ba .05 1.2 Relation of C.V. and b/a .20 .10 1.4 2.1 If we design the .25 2·5 b = 2.5a. 3.2 .40 5·5 ·50 l3· 9 ·55 41. 2 .60 ·577350 -52.0 ro X matrix such that the elements of uniformly distributed with that ·30 x~ -~ are C. V. = .25 , then from Table 3.1 we have In practice, many independent variables used in regression analysis in economic studies have C.V. between .20 to ,30. When we are dealing with an independent variable whose C.V. is not close to .25, the restriction (c) can be revised accordingly by using the relation given in Table 3.l. Further designing of the X matrix is by imposing restriction (b) to have an equally spaced elements of x~. -~ Recall that a part of designing an experiment is designing the construction of the X matrix. 4. COMPARISON OF EMPIRICAL BAYES TO ORDINARY LEAST SQUARES ESTIMATORS In this chapter, a simulation study will be presented to compare empirical Bayes (EB) to the ordinary least squares (OLS) estimators. The OLS estimators are chosen for comparison, since by the assumptions 3.4, particularly the for Models A*, B*, C* and D* in Section ~l' ~2' independence of estimators (Theil, 1971). pt, work with Model ..• , ~N ' they are the best linear unbiased For simplicity of the study, we shall only as described in Section 3.5. Let us rewrite Model Dt as follows: = 1, i where , 1:i of lIs and and E. -l and x. -1. Q'i , ~i E. -1. are 15 X are scalars. are as given for Model 1 (4.1) 2, ... , N vectors, 1 is a Assumptions about Dl in Section 3· 5. Q'i 15 , 1 X ~i Since here vector , -l T = 15 x. from (3.61), we can write x. lm (the middle element of and i 1, 2, ••• , N m 1, 2, " ., 15 (4.2) for x. ) . -l In addition, for ease of programming, the random variables and ~ 2 c, are chosen to be independent, although according to the assumptions for model D*, it does not necessarily require so. generation of random variables c, Q' , ~ and The ~2 will be described , 50 in later sections of this chapter and several cases concerning the form of the prior density g(c,a,~,~ In each case, the whole set of (experiments) will be replicated r 2 ) will be considered. N regression equations = 100 times and the values of are varied to be 2, 3, 4, 5, 10, 15, 20, 30 and 40. replication we generate at random N values of N Note that in each (c,a,~,~ 2 ) , so that in general, it will be different from replication to replication. 4.1 Objectives Of The Study Bayes estimators for a mUltiple linear regression model as seen in Section 3.3 are dependent on the prior density. Since EB estimators as given in Section 3.4 are actually two-stage Bayes estimators, therefore, they will indirectly depend on the true form of the prior density, although the knowledge of the exact form of this prior density is by-passed in their derivation. Therefore, we may suspect that the magnitude of improvement for EB over OLS estimators, if there is any, will somewhat depend on the true form of the prior density. This study will attempt to show that no matter what form the true prior density was, an improvement for EB over OLS estimators will always be gained, ~.~., EB estimators are at least as good as the OLS and in some cases they are much better in the mean squared error sense. The ratio of the average squared error, denoted by R ABE ASE EB estimator OLS estimator (4.3) 51 will be used as a measure of the improvement for EB over OLB • estimators; the smaller of ~ 2 ~ ~2 (a.,b.,s.), (~.,~.,~.) as the 018 and EB estimators for Denote 2 (~i'~i'~i) the larger improvement gained. R l l l l l l of the model given in (4.1), respectively. ~.,~.,~~ (i = 1, 2, ... , N) l l l 2 The true values are known by generation of random 2 ~ ~ ~2 variables ~,~,~ computed. The average squared errors for EB and 018 estimators re- , while (a.,b.,s.) l l (~.,~.,~.) and l l l l can be quired for obtaining (4.3) are computed as follows: A8E(a) (4.4) r ~ N l = 1 Z [-N Z (~. - ~l' )2 Jk r k=l and similar expressions for average squared error for and ~ ~ i=l ~ 2 Hence, the ratio of the is given by A A ASE~~ R~ = A8E a wi th similar expressions for l R~ and j (4.5) R ~ It is also an objective of this study to investigate the relation between number of N and R. In practice we are usually dealing with small N, say around 10, because if N is very large, it is im- practical to report individual estimates for regression parameters. In this case (if N is large), estimates for the means and variances of regression parameters, for example as given by 8wamy (1970), would be desirable. We can expect that the larger since the larger N, the smaller N the better the estimation of the required multivariate densities to obtain EB estimators. R, 52 4.2 • Generation of Random Variables c, a, ~ ~2 and c, a , For ease of programming, the random variables ~ and ~ 2 are taken to be independent, although the assumption for the model does not necessarily require so. density of c, a , and ~ individual prior densities, ~ 2 This means that the prior joint can be expressed as the products of the ~.~., Several forms of prior densities such as normal, U-shaped and L-shaped with given mean and variance will be used for the generation of values of Table 4.1 c, a, ~ and ~2, as described in Table 4.1. Several forms of prior densities of a studied c, a, ~ and ~2 2 b g4(~ ) Cases gl(C) g2(a) g3 (~) 1 2 3 4 5 6 7 8 9 N(10,9) L(10,9) US(10,9) N(10,144) N(10,9) N(0,9) N(-10,9) N(lO, 9) N(10,9) N(10,9) L(10,9) US (10, 9) N(10,9) N(10,144) N(10,9) N(10,9) N(0,9) D(lO,O) N(5, 2. 25) L(5,2.25) US(5,2.~5) N(5,2.25) N(5,36) N(5,2.25) N(5,2.25) N(0,2.25) D(5,0) N(50,225) L(50,225) US (50, 225) N(50,225) N(50, 225) N(50,225) N(50,225) N(50,225) N(50,225) ~=Normal, L=L-shaped, US=U-shaped, D=degenerate, the numbers within the bracket indicate the mean and variance, respectively. bIn the case of g4(~2) is normal, since ~2 > 0 , the prior density is taken to be .000008 + N(50,225) , =[ o for o< otherwise ~ 2 < 100 53 The L-shaped and U-shaped distributed random variables are derived by linear transformation of random variables W and Z, whose distributions are as follows: 1 (W_33 )2 few) := ( o , for , otherwise (4.6) and 1 z 2 , 1 -(1.5)3 < z < (1.5)3 for f(z) := ( o , Note that respectively. otherwise W and Z are L-shaped and U-shaped random variables, By the following linear transformations of we can obtain an L-shaped random variable standard deviation and a U-shaped random variable cry /.LV ' standard deviation y := with mean y W and Z, /.L y , V with mean cr ' V 4(3 - t. (4.8) \(5)cryW + /.L y - crY,fr 1 5 ~(3 - b\[5 )cryZ V 4.3 + /.LV • Computations Of The 018 And EB Estimates By the generation of random variables case described in Section .•. , N). From we can obtain c. , the elements ~ can be derived by using error component 4.2, €. -1 , (4.2), since €. -1 x. (m:= 1m and also from c, Of, f3 and cr 2 2 for each ci ,Of i ,f3 ,cri (i := 1, i 2, 1, 2, .'0' 15) of x. ~ cr~, we can generate the 1 is distributed N(o,cr~I). -1 01 a. 1 [~i] , denoted by The 018 estimate for [b~J, by using (3.11) is computed as follows: a. [b.~J (4.10) ~ Further, (4.10) can be rewritten as 1 15 2 15 a i = OI i + n-( E xi E i m=l m m=l =~. + b. 1 1 1 --D (15 15 E 15 E x. i m=l 15 (4. 11) E x. E x. E. ) im - m=l 1m m=l ~m ~m E. 1m 1m - 15 E x. m=l 15 E 1m m=l (4.12) E. ) 1m where 15 D. = 15 2 15 E x. - (E x. ) m=l 1m m=l 1m 1 2 ~~1 from (3.12) is given by The 018 estimate for 2 1 1 s. = -T (y. - X.b. )/(V. - X.b.) = - -p ~ where 1 Since we have 2 1 1-1 ~1 (X~X. flx!~ 1 ~ M. = I - X. ~ 1 T-p 1-~ and I is a 15 2 15 E x. m=l TXT identity matrix. ] , (4.13) can be expressed as = 15, Xi = [l,x. - -1 T 1 15 2 15 15 E x. E. s. = 13 C E E. - D--[ E x. (E E. ) 1 m=l lm 1. m= 1 ~m m= 1 1m - 2 (4.13 ) E!M.E. -~ ~-~ 15 E 1m m=l E. 1m m=l 1m 1m 2 15 2 + 15( E x. E. ) }] m=l where, as before, D i = 15 15 2 E x. m=l ~m 15 2 (E x. ) m=l ~m ~m 1m . (4.14 ) 55 EB estimates for ~., 1 2 and ~. 1 are computed, by applying ~. 1 (3.65) and (3.66) to model in (4.1), as follows: 15 - f N 2 ll(c.,a.,b.,s.) Z x. J " 1 1 1 1 m=l 1m ~ ~1' = b. + lID f (4.15) ~ ~ b S )[- f N,1 ,11(c.,a.,b.,S.) Z x. . N, 13 ( c., 1 1 1 1 m=l 1m 1 1 a., 1 1., 1. 1 + 15 f N 2 11(c.,a.,b.,S.)J " 1 111 (4.16) ?2 S.f 11(c.,a.,b.,S.) ~. = 1 N, 1 1 1 1) 1 Ilf , 13(c.,a.,b.,s. 1 l 1 1 N (4.17) where S. 1 = Us.12 , 15 2 15 2 D. = 15 Z x. - (Z x. ) m=l 1m m=l 1m 1 and f N, 13(c.,a.,b.,S.), 1 1 1 1 f N, 11(c.,a.,b.,S.) 1 111 , f N" 1 11(c.,a.,b.,S.), 1 1 1 1 f N,2,11(c.,a.,b.,S.) 1 1 1 1 are as defined below, by applying (3.62), (3.63) and (3.64) on the model in (4.1). Let 1 V N 2 . 1 1= 1 1 N -N[ Z c. - -N( Z c.) . 1 1= 1 2 l J 56 1 6 i1 =N 10 2 si - 15 r: D.~ m=1 x 2 im S* - 13 i - 15 Si ' Then, f N, 1) (c i , ai' bi' Si) = ii t + c. -c sine C ~6 a. -a k]sin[ b -b ~1 S._S~)2 f 1 11(c.,a.,b.,S.) = -K. ~ ~ ~ ~ N, ~ S -s 0 ~6.k]sin[ ~6 k]sin[~] sine ~ ~l O S.-Sf [~ '4 ~J i2 '4 2 (4.18 ) 57 (4.19) 2 f N" 8.) = 1 lie c . , a. , b . , l l J. J. 1 K l: i vi 1 N + r: k=1 ~i a.-a k +5· 1 b.-b k Sin[~Jsin[ l25. l JSin[~J ~ c.-ck II c.-ck . l [26 J J.2 a.-a l k +5·l 1 [ 2o J i1 b.-b l k [--25----] i2 2 8._8*0 sin[F] 8.-8* [l k J '4 1 - --l:--[f N' V il 11(c.,a.,b.,8.)J l J. J. J. (4.20) f N" 2 11(c.,a.,b.,8.) J. J. J. J. =K 1 l: i V i2 8'-8~02 sin[.5]sin[ ~ l] ( 8. [.5J -S~ [~ l] 58 (4.21) a., In programming the simulation, ~ 2? ~ s., ex., t3;..... b., ~ ~ are ~ computed by (4.11), (4.12), (4.14), (4.15), (4.16) and (4.17), respectively, so that we do not have to generate the values of y .. -~ 4.4 Results The ratios of the average squared errors of EB to OLS estimators, denoted by R, ex RR ..... and R0- , as described in Section 4.1 and 4.2 were first computed for cases 1, 2, and 3 (see Table 4.1). The results are given be low in Tab le 4.2. Table 4.2 Ratio of the average squared error of EB to 013 estimators for cases 1, 2 and 3 a Case 1 N 2 3 4 5 10 15 20 30 40 R R ·943 ·905 .867 .832 ·719 .674 .628 .587 ·553 ·913 .895 .862 .803 .664 .634 .600 .540 ·513 ex t3 Case 2 R 0- 1.'488 1.428 1.359 1.318 1. 222 1.147 1.107 1. 043 1.013 R ex ·939 .884 .846 .817 ·728 .681 .642 .584 ·557 Case 3 R R t3 ·925 .872 .826 .808 ·705 .660 .621 ·569 .544 0- 1.430 1. 313 1. 277 1. 265 1. 220 1.183 10139 10075 1. 032 aCases 1, 2, and 3 correspond to Table 4.1. R ex ·953 .867 .861 .826 ·729 .678 .625 ·577 ·546 R t3 ·944 .881 .860 .819 ·721 .674 .623 ·575 .547 R 0- 1.636 1.355 1.318 1.306 1.122 1. 097 1. 069 1.026 ·976 59 From Table 4.2, we see that the ratios smaller than 1 and are always decreasing as means that EB estimators for a and ~ and R a R~ are all N becomes larger. This for cases 1, 2, and 3 are better than the OL6 in the average squared error sense. More improvements are gained for EB over OL8 estimators, the larger the number of experiments a and ~ (N). are notable, for The improvements for EB estimators for N = 10 or larger the gain is more than Recall that cases 1, 2, and 3 (see Table 4.1) have different shape of prior densities, but their respective means and variances are the same. It seems that the shape of prior densities of does not affect much the ratios of R , and a ~ respective means and variances are the same. Note from Table 4.2, however, the ratio than 1, except in case 3 for that EB estimator for the ratio R(J' (J'2 N R(J' R(J' c, , a, and ~ given their is always greater = 40 where R(J' = .976 , which means is worse than the OL8 for is also decreasing as N ~ 40 , but, N becomes larger. What so that the follows is an attempt to improve the EB estimator for improvement will be gained for small values of N, say for N around 10 or even less. The high ratio of R(J' in Table 4.2 (for caused by the large variance of ~2 (J' , or both. ~2 (J' N ~ 40 ) might be , or, the large marginal bias of An estimate for the marginal bias for ~2 (J' , if is known, is computed as (4.22) 60 Using (4.22) with r = 100 , the marginal biases of ~2 cr for cases 1, 2 and 3 were computed and the results are given in Table 4.3 below. Table 4.3 Estimates for the marginal bias of and 3 ~2 cr for cases 1, 2 N Case 1 Case 2 Case 3 2 3 4 5 10 15 20 30 40 6.10 7.4 4 8·50 7·60 9·11 9·12 B.50 8·52 7·92 5·44 6.87 6.24 7·58 7·67 8.25 8.15 7·99 7·29 5·43 6.66 6·50 6.29 4·94 6·77 7· 03 6.31 5·72 The results in Table 4.3 reveal to us that the direction of the marginal bias for N ~ 40 is positive. Therefore, correcting ~2 cr. l of (3.66) by a constant positive mUltiple factor of less than one will simultaneously reduce the variance as well as the marginal bias of ~2 cr , for N ~ 40 . Several positive correction factors less than one were examined in the study, namely, .95, .90, .85, .80 and ·75· selected since estimates of the marginal bias of around 10% to 15% of 1 r N ~2 i=l l -- E [E cr. J Nr k=l k These numbers were ~2 cr in Table 4.3 are 61 ~2 By applying the above correction factors for cr , the Ratio R cr was recomputed for cases 1, 2 and 3 and the results are given in Table 4.4. Observing the values of R cr of .80 gives a good result where ~2 From now on, .80 cri p = 15 R is less than one for cr ~2 (4.1), instead of cr. ~ that this correction factor of T 4.4, a correction factor will be used as an EB estimator for simulation of model where in Table ~2 cr as given in N 2 ~ 3 . cr i for the (4.17). Note is recommended only for model (4.1), (the number of observations in the ith experiment) and = 2 (the number of independent variables including the dummy variable, which is the same as the number of regression coefficients including the intercept). Remarks: ~2 Note that the correction factor of cr of .80 (Table 4.4) was chosen since it gave the smallest ratio Rcr (among all correction factors considered) for cases 1, 2 and 3. However, this 2 ~2 corrected EB estimator for cr , i'~" .80 cr , seems to give a downward marginal bias as a trade off for small variance. .80 ~2 cr for cases marginal biases of Reca11 . ag~n, th a t th e Bayes es t·~mat or f or 2 cr 2(c.,b.,S.) ~ -~ ~ T-pf T (ci,b.,S.) ~ ~ N, -p 2 , denoted by Acr2, cr ~2 cr (EB , which is a two-stage Bayes estimator) must be due to the estimation of the multivariate densities f 3 are given in Table Therefore, the marginal bias for is marginally unbiased. estimator for 1, 2 and Estimates of of and f T-p (c.,b.,S.) ~ -~ ~ and (3.62) and (3.63) whose estimates are denoted by f T 2(c.,b.,S.). ~ -~ ~ N, -p- These multivariate densities and their estimates, as indicated by the subscript of depend on the degrees of freedom of S, i.~., f, (T-p), therefore, we 62 Table 4.4 N The ratio R for cases 1, 2 and 3 after applying ~ ~2 several correction factors (C. F.) for ~ C. F. =1. 00 C.F~=·95 C.F·=·90 C.F.=.85 c.F.=.80 C.F·=·75 1.091 ·987 .871 .848 ·761 ·717 .692 .655 .642 1. 045 ·943 .814 .801 ' ·709 .668 .653 .627 .620 1. 041 ·951 .809 .810 ·707 .668 .664 .649 .649 1. 035 .898 .874 .850 .800 ·751 ·710 .677 .662 ·979 .839 • 8~7 ·791 ·748 .697 .661 .639 .631 .961 .821 .822 ·774 ·739 .688 .660 .647 .647 1.286 ·937 .899 .885 .807 ·741 ·701 .675 .657 1. 250 .899 .862 .842 ·798 ·718 .675 .656 .646 1. 254 0911 .876 .849 .837 .742 .698 .686 .685 Case 1 2 3 4 5 10 15 20 30 40 1. 488 1.428 1.359 1.318 1.222 1.147 1.107 1. 043 1. 013 1·313 1. 229 1.144 . 1.106 1.018 ·956 ·919 .863 .839 1.181 1.082 ·981 ·950 .864 .812 ·781 .734 ·715 Case 2 2 3 4 5 10 15 20 30 40 1. 430 1.312 1.277 1.265 1. 220 1.183 1.139 1.075 1. 032 1. 261 1.134 1.093 1. 093 1.036 ·994 ·949 .895 .862 1.129 ·996 .962 ·951 .896 .850 .806 ·763 ·739 Case 3 2 3 4 5 10 15 20 30 40 1.639 1.355 1.318 1.306 1.122 1. 097 1.069 1. 026 ·976 1.479 1.165 1.127 1.117 ·969 ·931 .898 .860 .821 1.362 1. 025 .988 .976 .864 .813 ·775 .743 ·715 63 Estimates for the marginal bias of 2 and 3 Table 4.5 N 2 3 4 5 10 15 20 30 40 ~2 (j for cases 1, Case 1 Case 2 Case 3 -5·12 -4.05 -3.20 -3·92 -2·71 -2·71 -3·20 -3.18 -3.66 -5.64 -4.51 -5·01 -3.94 -3.86 -3.40 -3.48 -3.60 -4.17 -9·66 -4.()8 -4.80 -4.97 -6.05 -4.58 -4.38 -4·95 -5·42 simulation for case 1 by varying ~2 (T-p). A (T-p) -- in this case by varying T can expect that the marginal bias of only and fixed .80 (j also depends on p = 2 -- was conducted with N = 10 Computing the average of ~2 (j (j 1 r = 200 • as A :2 and N r = -- t [t ~2 (j] Nr k=l i=l i k and the estimate of the marginal bias of ~2 (j as given in (4.22), we obtain the results as presented in Table 4.6. From colums (1) and (3) of Table 4.6, we see that the marginal bias of ~2 (j is inversely related to the number of degrees of freedom (T-p) , which suggests us to regress column:(3) on the inverse of column (1). Regressing the marginal bias of A .... 1 -2 (j T-p gives the following relation: ~2 (j on 64 Table 4.6 The average of EB estimates for ~2 (uncorrected) and estimates of its marginal bias for case 1, where p = 2 and N = 10 (1) (2) (3 ) (4) T-p -2 ~ Bias (~2) 1.77 -2 --~ T-p 7 9 11 65· 85 62.01 62.59 58.10 55.46 54.01 54.31 55·83 55.44 15· 85 12.01 12·59 8.10 5.46 4.01 4.31 5.83 5.44 16.66 12.19 10.07 7·91 6.55 5.63 ... A- 13 15 17 19 21 23 A A- 5.06 4·71 4.27 AA- 1·77 ~2 T-p The values of For removing the m~rginal bias ~2 EB estimator for ~ ~2 Note that for of ~2, by relation (4.23), the T-E-l.77 , i.~., the corrected T-p is given by T-~-1.77 &2 . correction factor for comes are given in column (4) of Table 4.6. should be -p T = 15 and p = 2 , this correction factor be- .86. The discrepancy of the new correction factor (.86) to the old one (.80) can be explained as follows. The correction factor of .86 will almost remove the marginal bias but give a higher variance of EB estimator for ~2 compared to that with correction factor of .80. In general, the correction factor ~2 ~ such that the margin~l an EB estimator for T-;p-l·77 T-p can be used for correcting bias is mostly removed, however to obtain ~2 with a lower ratio R , the correction factor must be slightly smaller. ~ End of remarks. Applying the correction factQr of .80 for ~2 ~ , the ratio of the average squared errQr of EB to 018 estimators, denoted by R , a R~ and R~, were computed for cases ~, 5, 6, 7, 8 and 9 (See Table 4.1) as outlined in Section 4.1 and 4.2 and the results are given in Table 4·7· Table 4.7 The ratio of the average squared error of EB to OLS estimators for cases 4, 5, 6, 7, 8 and 9a Case 4 01 ~ ·952 ·920 .888 .863 ·730 .679 .621 .577 .537 ·707 .607 ·703 .508 .453 .435 .435 .316 .314 N R 2 3 4 5 10 15 20 30 40 R ~ 1.040 ·950 .820 .807 ·721 .680 .664 .637 .630 ·958 ·921 .851 .807 ·743 .691 .647 .604 ·570 10 15 20 30 40 01 ~ .978 ·979 ·971 ·975 ·943 ·927 .918 .896 .885 ·958 ·964 .964 ·957 ·953 ·928 ·930 ·903 .894 R R ~ 1.075 .986 .864 .861 ·784 .749 ·736 ·708 .699 Case 8 Case 7 2 3 4 5 Case 6 Case 5 1. 043 ·920 .806 .766 ·750 ·709 .637 .580 .543 •!t26 ·946 .804 ·793 ·705 .671 .658 .633 .626 ·943 ·905 .867 .832 ·719 .674 .628 .587 .553 01 ~ .898 .826 ·764 ·715 ·557 .499 .455 .403 .375 .683 ·590 .544 .473 .311 .272 .094 .038 .033 R R ~ 1. 042 ·926 .800 ·785 .696 .656 .643 .616 .614 Case 9 ·913 .895 .862 .803 .664 .634 .600 .540 .513 1. 045 ·943 .814 .801 ·709 .668 .653 .627 .620 .871 ·775 ·705 .649 ·527 .471 .428 .384 .359 .872 ·768 ·703 .633 .489 .447 .412 .352 .340 1.024 ·922 ·785 ·773 .680 .644 .632 .607 .604 aSee Table 4.1 for the description of cases 4, 5, 6, 7, 8 and 9· Comparing the values of R OI , R~ from Table 4.2 wi th R~ taken from Table 4.4 (C.F. = .80) for cases 1, 2 and 3 to those values of R 01, FL ~ and R ~ in Table 4.7 for cases 4, 5, 6, 7, 8 and 9, several results on the next page can be observed. 66 (a) The shape of the prior densities of and ~ ~ 2 R , a does not affect much the ratios and R~, ~ c, a, given their respective means and variances are the same (comparisons of cases 1, 2 and 3. That is why the shapes of prior densities for other cases were, then, taken to be normal). (b) The higher the variability of c -- which represents the X matrix better EB estimator for affect much the ratios mean of a, c 2 ~,~ among experiments, the ~ R a but it does not and R , given the ~ and the respective prior densities of are the same (comparison of case 1 to case 4). (c) The smaller the absolute value of the mean of c-- which represents the absolute magnitude of the X matrix -- , the better EB estimators for expecially for ratio R~, ~,but a and it does not affect much the given the variance of c and the respective prior densities for same (comparisons of cases 1, 6 and 7). are the The reason for this is that when the magnitude of the elements of the X matrix is small, the magnitude of the elements of (X/x)-l will be large, which in turn makes the variance of an L8 estimator large. That is why the ratio of the average squared error of EB to that of an OLS (which reflects its variance, since OLS estimators are unbiased) estimators becomes smaller. (d) The ratios and R CT seem to be un- affected by the change of the prior means of and , given the variances of 01 the respective prior densities for c 01 ~ , ~ and and CT 2 are the same (comparison of case 1 to case 8 ). {~) The smaller the variances of better EB estimators for 01 01 and and ~ respective prior densities of the same. c ~.~., , and , the given the CT 2 are The best EB estimators will be obtained when the variances of zero, ~ when Of and ~ 01 and are ~ are actually fixed (comparisons of cases 1, 5 and 9). Results of the simulation for the generated values of (represented the 2 (OI,~,CT) X mat~ix), the generated parameter values of with their respective 018 and EB estimates, taken from one replication of cases 1 through 9 for the special case of T = 15 c , are presented in the Appendix. N = 10 , Since the true parameter values are known, we can see from the Appendix the direction as well as the magnitude of improvement of each of EB over its corresponding OLS estimate. 68 5. APPLICATIONS TO SOME PROBLEMS IN ECONOMETRICS In economic studies, the nature of experiments described by Models A*, B*, C* and D* of Section 3.4 might be relationships of micro-unit variables such as firms, farms, households, etc., observed over a period of time, or it might be relationships of regions with observations over a time series or over several sub-units in each region. The estimate of the U.S. consumption function (a distributed lag model), for example, as obtained by Griliches et al. Zellner and Geisel (1962) or by (1970) might still be improved if we incorporate the data from several other countries. Similarly, if the constancy of regression coefficients over a long time period is doubtful, we might break down this series into several sub-series and assume that the regression coefficients are constant within a sub-series but they are random across sub-series. By assumptions as described in Section 3.4, empirical Bayes estimates for regression coefficients in each sub-series will be better than ordinary least squares. Two examples will be given on the next pages to illustrate the applications of empirical Bayes approach in economic studies. 5.1 Investment Analysis Grunfeld (1958) constructed a multiple linear regression model to study the determinants of corporate investments. variable I, the independent variables model are described as follows: F_ l and The dependent C_ l used in the I ~ gross investment additions to plant and ~ equipment plus maintenance and repairs in millions of dollars, deflated by F_ l ~ value of the firm = price l ; of common and preferred shares at December price of December P 31 (or average 31 and January 31 of the following year) times number of common and preferred shares outstanding plus total book value of debt at December dollars deflated by C_ l 31 in million P , lagged one year; 2 = the stock of plant and equipment = accumulated sum of net additions to plant and equipment deflated by P allowance deflated by P l 3 minus depreciation , lagged one year; where P l = implicit price deflator of producers durable equipment (base P 2 1947); = implicit price deflator of G.N.P. (base 1947); P3. = depreciation expense deflator = ten years moving average of wholesale price index of metals and metal products (base 1947). The description of the above variables and the time series data (1935-54) were also given in an article by Boot and de Wit (1960). 70 By assuming Model D* of Section 3.4 for these investment data, where we have = 10, N 20 T and p = 3 , EB estimates for regression coefficients and error variance are computed by using (3.56) and (3.57). The results, as well as the OLS estimates, are presented in Table 5.1 below. Table 5.1 Corporation 1 2 3 4 5 6 7 8 9 10 OLS and EB estimates for regression coefficients and error variances for the ten corporations a OLS estimates Inter- Coeff. Coeff. cept. of F_ of C_ l l -149.78 -- 49·20 9·96 - 6.19 - 22·71 8.69 4.50 0·51 7·72 0.16 .1193 .1749 .0266 .0779 .1624 .1315 .0875 .0529 .0753 .0046 .3714 .3896 .1517 .3157 .0031 .0854 .1238 .0924 .0821 .4347 s 2 8423.87 9299.60 777·45 176.32 82.17 65.33 88.67 104.31 82.83 1.18 Intercept EBestimates Coeff. Coeff. of F_ l of C_ l .1205 .1638 .0310 .0786 22.26 .1646 - 7·25 .1172 - 3·01 .0854 - 3.17 .0579 - 7.94 .0725 0.03 .0071 -157·28 - 16.81 - 15.60 - 6.40 .3670 .3572 .1453 .3128 .0028 .1291 .1198 .0829 .0859 .4292 .. 2 cr 8311.60 9134·70 777·32 176.32 82.17 65.33 88.69 104.33 82.84 1.18 ~he ten corporations are: (1) General Motors, (2) U.S. Steel, (3) General Electric, (4) Chrysler, (5) Atlantic Refineries, (6) I.B.M., (7) Union Oil, (8) Westinghouse, (9) Goodyear, and (10) Diamond Match. On the other hand, there is another different approach due to Zellner (1962) for analyzing these data of ten corporations, based on the following model: y. -]. where = X.". ].-]. + E. -]. , i = 1, 2, ••. , N (5·1) 71 (a) ~i (i = 1, 2, (b) x.1 (i = 1, 2, ... , N) ... , N) is fixed; is non-stochastic with full rank; (c) E (e. ) -J. = -0 for i = 1, 2, ... , N , i, j = 1, 2, E(e.en = l-J eJ •• 1 lJ for all and ... , N . Applying Zellner's two-stage least squares on the above data from ten corporations, we obtain estimates for regression coefficients as given in Table 5.2 below. Table 5.2 Zellner's two-stage LS estimates for regression coefficients for the ten corporations Corporations Intercept 1. 2. 3. 4. 5· 6. 7· 8. 9· 10. -133·00 - 18.60 - 11.20 2.45 26.50 5.56 9.67 4.11 2.58 2.20 G.M. U.S. Steel G.E. Chrysler Atl. Ref. I. B.M. U. Oil West. G. Year D. Match. Coefficient of F_ l .1130 .1700 .0332 .0672 ·1310 .1310 .1120 .0525 .0760 -.0181 Coefficient of C_ l .3860 .3200 .1240 .3060 .0102 .0571 .1280 .0412 .0641 .3650 Zellner (1963) also claimed that only if the absolute value of the true correlation coefficient of error components from two different experiments, ~.~., 72 is in the neighborhood of zero and/or (T-2p) is small, then the OLS estimators are slightly more efficient than the two-stage least squares; more efficiency of the two-stage LS will be gained the higher and/or (T-2p). XiX. 2 J most when Ipl The efficiency over the OLS estimates will be gained = ~0 for all i Fj . EB estimators, therefore, are not comparable to Zellner's twostage 18 since they are based on different assumptions about error components; EB estimators are superior over the 018 in the case where p =0 and N (the number of experiments) is large enough while Zellner's two-stage LS does not depend on efficient than the 018 when (~5), N and it is more p is not close to zero. At this point, however, a study can be conducted to investigate the robustness of EB estimators when p F0 and compare them to Zellner's two-stage least squares estimators. 5.2 Elasticities of Substitution 018 estimates for elasticities of substitution in the demands of the selected importers between flue-cured from the United States and tobacco from other sources were given by Capel (1966) using annual data for 1955-1964. He constructed the following linearized model. where A,B = quantities of United States flue-cured and other tobacco imported by a country in a particular year, respectively; 73 Pa'Pb = their prices; A = elasticity k = intercept; fJ. = random of substitution between A and B; error. By assuming Model D* of Section 3.4, where we have T = 10 and N=9 , p = 2 , EB estimates for the elasticities of substitution for the nine countries are presented in Table 5.3, along with their OLS estimates. Table 5.3 018 and EB estimates for elasticities of substitution in demand between flue-cured tobacco from the United States and tobacco from other countries a i~ort Importing Countries 018 estimates Uni ted Kingdom West Germany Netherlands Belgium Japan Egypt Ireland Denmark Sweden ~he data were taken from Capel -2.47 -3.57 -1.47 -5·15 -0.81 -2.19 -0·99 -1.04 -0.26 EB estimates -2.463 -3.581 -l. 588 -4.257 -0.825 -1.889 -l. 002 -1.805 -0• .l38 (1966), which sources are: Egypt: Uni ted Arab Republic tobacco report, unpublished report, Foreign Agricultural Service, Cairo, Egypt, 1965; other countries: United Nations (1956-1965). 6. SUMMARY, CONCLUSIONS AND RECOMMENDATIONS 6. 1 Summary Consider the following mUltiple linear regression model: E. f3. + -J. -J. 'D<p p<l 'lXl X. J. , i = 1, 2, ••• , N where (a) .§.l' .§.2' ..., .§.N are independent and 2 is distributed as N(O, cr. I) - (b) 2 f3 i ,cr. - J. E. -J. J. are fixed parameters and X.J. is a non-stochastic full rank matrix. The ordinary least squares estimator for f3. , denoted by -J. is the best linear unbiased and 2 1 s.J. = T-p (y.J. - X.J.-J. b. ) I (;Z. - X. b. ) J. J.-J. is the minimum variance unbiased estimator for By assuming that the coefficient vector residuals 2 cr or the 2 cr. • J. ~, or the variance of X matrix or some combinations of them to be random across units, but fixed within units, four variants of mUltiple linear regression models were constructed. When the ~ priori distribution is assumed known, the Bayes estimation for each model was 75 presented. ~ However, when no assumption is made about the form of the priori distribution, a two-stage Bayes estimator which is called the empirical Bayes estimator -- was developed for each model. A comparison of empirical Bayes to the ordinary least squares was performed by means of a simulation study using a mUltiple linear regression model; and,- finally, some applications of the method in econometrics were pointed out. It was found that for all cases stUdied, the empirical Bayes estimators are always better than the ordinary least squares, in the sense that their corresponding average squared errors are smaller, no matter what was the true form of the unknown joint prior density of x, ~ 6.2 and 0- 2 Conclusions and Recommendations Relationships, in this case multiple linear regression, of micro- unit variables such as firms, farms, households, regions, etc., might be considered as a random sample from a larger popUlation. the case, the independent variables and ~arameters If this is can be treated as random variables across micro-units, but they are fixed as a realization of a random variable over T successive observations within a micro-unit. Empirical Bayes estimators for ~ and 2 0- , for the simple linear regression, are always better than the 018 in the sense that their average squared errors are smaller. The ratio of the average squared error for EB to OLS estimators is practically not affected by different forms of the joint prior density of X, a, ~, and (J' 2 given their mean and variance are the same. More improvement for EB estimators will be gained over the OLS, the smaller the prior variance of a and/or ~,~.~., the closer the random regression model to a fixed (parameter) regression model. Similarly, the higher the variability of the X matrix across micro- uni ts, the more improvement gained for EB over the OLS estimators. This is the case often found in practice, when we are dealing with different sizes of micro-units such as firms, farms, regions, countries, etc., where most likely the variability of independent variables across micro-units is high. Here, the EB estimation is mostly recommended. EB estimators are better than the 018, when the error components across, as well as within, the micro-units are independent. The robustness of EB estimator when the error components are correlated need to be investigated further, in this case the appropriate comparison is to the Zellner's two-stage least squares instead of the ordinary least squares estimators. Different types of functions other than the one used in (3.33) or even different multivariate density estimators other than that of Cacoullus might be investigated to obtain better EB estimators than those outlined in this thesis. Throughout this thesis, only the point estimation is discussed. The EB interval estimation as well as testing of hypotheses for random regression models are recommended for future research. 77 7. 1. REFERENCES Boot, J.C.S. and S.M. de Wit (1960). Investment demand; an empirical contribution. International Economic Review, 1:3-30. , . 2. Cacoullus, T. (1966). Estimation of a multivariate density. Inst. Statist. Math. (Tokyo, Japan), 18:174-183. 3. Capel, R. E. (1966). An analysis of the export demand for United States flue-cured tobacco. Ph.D. Thesis, Department of Economics, N.C. State University, Raleigh, N.C. Univ. Microfilms, Ann Arbor, Mich. 4. Chetty, V. K. (1968). Pooling of time series and cross section data. Econometrica, 36:279-290. 5· Chetty, V. K. (1971). Estimation of Solow's distributed lag models. Econometrica, 39:99-117. 6. Clemmer, B.A. and R. S. Krutchkoff (1968). The.,use of empirical Bayes estimators in a linear regression model. Biometrika, Ann. 55: 525-534. 7. Graybill, F.A. (1961). An Introduction to Statistical Linear Models. Vol. 1. McGraw-Hill Book Co., Inc., New York, N.Y. 8. Griliches, z., G. S. Maddala, R. Lucas and N. Wallace (1962). Notes on estimated aggregate quarterly consumption functions. Econometrica, 30:491-500. 9. Grunfeld, Y. (1958). The determinants of corporate investment. Unpublished Ph.D. Thesis, Department of Economics, Univ. of Chicago, Chicago, Ill. Univ. Microfilms, Ann Arbor, Mich. 10. Jeffreys, H. (1961). Theory of Probability. Clarendon Press, Oxford, England. 11. Kendall, M. G. and A. Stuart (1961). The Advanced Theory of Statistics. Vol. 2. Hafner Publishing Co., New York, N.Y. 12. Klein, L. R. (1953). A Textbook of Econometrics. Peterson and Co., Evanston, IL~nois. 13. Krutchkoff, R. G. (1967). A supplementary sample non-parametric empirical Bayes approach to some statistical decision problems. Biometrika, 54:451-458. Third Edition. Row, 14. Kuh, E. (1959). The validity of cross-sectionally estimated behavior equations in time series applications. Econometrica, 27:197-214. 15. Maritz, J. S. (1969). Empirical Bayes estimation for the Poisson distribution. Biometrika, 56:349-349. 16. Martz, H. F. and R. S. Krutchkoff (1969). Empirical Bayes estimators in a multiple linear regression model. Biometrika, 56:367-374. 17. Nerlove, M. (1965). Estimation and Identification of CobbDouglas Production Functions. Rand McNally and Company, Chicago, Ill. 18. Robbins, H. (1955). An empirical Bayes approach to statistics. Proc. Third Berkeley Symposium on Math. Stat. and Prob., 1:157, Berkeley, Calif. 19. Robbins, H. (1964). The empirical Bayes approach to statistical decision problems. Annals of Math. Stat., 35:1-20. 20. Rutherford, J.R. and R. S. Krutchkoff (1969). optimality of empirical Bayes estimators. € -asymptoti c Biometrika, 56 :221-223. 21. Samuel, E. (1963). An empirical Bayes approach to testing certain parametric hypotheses. Annals of Math. Stat. 34 :1.370-1.385· 22. Swamy, P.A.V.B. (1970). Efficient inference in a random coefficient regression model. Econometrica, 38:311-323. 23. Theil, H. (1971). Principles of Econometrics. Sons, Inc., New York, N. Y. 24. Tiao, G.C. and A. Zellner (1965). Bayes' theorem and the use of prior knowledge in regression analysis. Biometrika, John Wiley and 51:219-230. 25. Zellner, A. (1962). An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J. Amer. Stat. Assoc., 57:348-368. 26. Zellner, A. (1963). Estimators for seemingly unrelated regression equations; some exact finite sample results. Amer. Stat. Assoc., 58:977-992. J. 79 * 27. Zellner, A. (1966). On the aggregation problem: A new approach to a troublesome problem. Report 6628, Center for Mathematical Studies in Business and Economics, University of Chicago, Chicago, Ill. 28. Zellner, A. and M. S. Geisel (1970). Analysis of distributed lag models with applications to consumption function estimation. Econometrica, 38:865-888. 29. Zellner, A. and C. J. Park (1965). Bayesian analysis of a class distributed lag models. Econometric Annual of the Indian Journal, 13:432-444. 80 e, 8. APPENDIX e e e Table 8.1 Generated values of c,~,~, ~2 and their OLS and EB estimates, taken from one replication, where N = 10, T = 15, for cases 1 through 9 OLS estimates Experiments 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 c ~ ~ 2 a ~ b EB estimates s 2 ~ ex "" ~ ,,2 "~ a 2 Case 1: c ~ N(10,9), ~ ~ N(10,9), ~ ~ N(5,2.25), ~ ~ N(50,225) 11.60 5.84 9.45 13.31 7·29 7·55 6.85 -9.67 8.40 11.37 10.41 10.83 Case 2: c ~ L(10,9), ~ ~ L(10,9) ~ ~ 10·52 9·12 6.98 10·93 10.36 10·73 8.38 8.42 14.56 18.65 9.36 12·57 7.61 8.25 12·51 9.84 7·17 6.61 13.27 7·52 8.~2 1l·50 5.54 9·41 12·93 10·52 9·36 5.89 3.39 3·90 7·47 7· 54 5·77 4.67 2.59 .3.03 6.80 2.04 4.88 7.41 4.79 7·71 6.68 5.45 4.18 3·70 9·02 7·13 68.22 69.09 46.58 33.89 36.82 48.06 58.58 36.25 51.66 26.06 58.14 69· 78 55·58 36·78 33·90 59.46 43.59 60.19 61.12 62.14 13.84 2.02 11·55 15·55 -1·57 9·22 20.03 0.44 18.61 10.88 3.05 5.54 7·15 7.34 6.86 4.42 2.04 4.10 5·55 1.70 2 L(5,2.25), ~ ~ 12.02 25.67 9·98 5·03 23.18 12.22 3.54 0.22 11. 23 -3·97 4.70 6.04 4.46 7·91 5·76 5.30 4.43 4.22 9·11 7· 79 97.56 60.50 41. 95 44.84 45.28 72.41 63.35 47·54 53·90 29·77 3.94 12.36 15.36 2.35 8.53 14.63 1. 01 16.09 10·95 3.19 5.22 7.06 7·35 6.33 4.50 2.80 4.05 5.83 1.10 77·13 55·40 40.22 41.81 43.68 63.35 60.11 48·57 51.32 30.26 11. 29 24.20 9·43 4.13 22·75 12.13 4.04 1.68 12.08 -3.41 4.76 6.17 4.53 7·99 5·79 5.30 4.38 4.06 9.04 7.76 25·25 56.01 25·10 33.40 32·57 79·60 66.16 20.36 46.62 22·79 12.12 L(50,225) 2Y.·94 59.66 24.88 35.22 32.34 94.24 72.40 20.17 50.08 24.05 ():) ~ e e e Table 8.1. (Continued) OLB estimares Experiments c Case 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 3: c ~ US(10,9), 11.85 6.54 13·52 13.76 13.47 11.65 9·27 6.50 7·19 12·,79 6·73 6.47 13.66 8.15 7· 53 -6.79 13· 76 13.32 6·96 6.82 Case t3 Of 4: 8.33 -0·55 14.24 4.07 13·79 -8·99 31.10 -22.04 24.52 9·06 C N ~ Of 6.63 3.09 6.84 3.82 6.85 3.26 3.86 3.24 3·33 6.85 N(10,144), 9·90 7.66 13.67 8·07 11.10 9·33 9·20 7.86 7·37 12.16 0" 2 US(10,9), 65.25 67.75 61.68 64.12 63.67 66.52 65.96 38.18 35·72 36.25 Of 5.04 2·92 6.01 4.13 5·25 7·14 2.84 4.97 6.42 1. 83·· ~ b a t3 ~ US(5,2.25), EB estimates s 2 "- "Of 50.08 71.16 45·96 36.61 64·76 65·77 50.20 52.34 76.11 37· 97 4·96 14.48 12·94 -2·70 10·55 3.32 -5·15 6.28 -0.69 18;21 5.63 6.07 6.81 5·29 6.88 3.17 4·93 6.93 1.31 ~2 0" 2 ~ US (50, 225 ) 6.06 2.64 6·73 4·99 6.03 3.19 4.25 3.66 4.02 8.13 16~65 "- t3 0" 75.36 15·23 61. 29 8.49 - 83.70 17·68 72.13 7.49 33.84 17.68 68.26 7·51 60.36 3·51 2.09 19·76 0.10 20.38 -1·75 2.48 4.24 26.69 N(10,9), ~ ~ N(5,2.25), 0"2 ~ N(50,225) 14·93 12·91 16.99 6.51 18·96 10.31 3.28 2.94 "- 39.69 60.28 34.60 36.66 94.68 72.25 29.84 24.62 149,70 30.23 4.35 9.54 11.68 0·52 6.63 0.04 -5·13 6.56 1.31 17·91 5·99 3·30 6.67 4.87 6.18 3·59 4.24 3·72 3·77 7·88 67,61 57·53 74.44 65.37 35.47 62.48 56.08 19·74 20.64 26.09 5·70 7·99 6.15 6.04 5.56 6.52 3.16 4·94 6.84 1.33 39·10 57·38 34.05 35·83 87"27 68.54 28.20 23.82 129.68 28.64 OJ f\) e e Table 8.1 e (Continued) OL8 estimates Experiments 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 c Case 5: c ,..,. N(lO,9), 9·74 17·83 10·55 9·92 9·64 9·99 11. 50 12.69 11.67 8·74 l7·05 11.68 -3.66 27· 93 -6.21 23.83 8.00 l8·93 7·88 Case 6: c ,..,. N(0,9), -4.03 0.85 1. 84 -3.08 1.35 -1. 53 -3· 53 -3· 92 -1. 80 6.46 9·53 13.32 8.74 15.06 ~8·98 6.34 13.87 12.49 13.30 13· 25 9·15 2 f' a a (J" Of "'" -0.14 1. 97 10.12 18.68 13.09 3.38 loll -0·37 5·75 2·79 N(10, 144), 74.82 70.89 59·95 29·09 60.39 22.64 25.25 44.38 41.25 62.61 ex ,.., N(1O,9), l.19 4.64 7·29 6.05 3·00 6·30 6.35 5·45 3.14 5·01 36.24 23.20 29· 72 26.33 43.83 45.05 33·76 70.80 47·35 40.46 EB estimates s b f' "'" N(5,36 ), 24.16 ll.14 13.29 1. 40 32.43 -5· 75 25·03 12.27 21+.34 4.c23 '-0.80 2.17 10.24 18.21 12·78 3.17 0.96 -0·75 5.36 3.45 t5 ,.., N(5, 2.25), 12.05 25·10 4.26 11.34 14.09 17· 98 19·36 11. 74 8.83 10.11 1. 74 -8.61 9·94 7·13 -2.64 7·77 7·45 4·94 1.65 5·ll (J" 2 ,.., 2 ""Of ""- f' ~2 (J" N(50,225) 44.89 21·90 68.81 10.80 12.80 68.98 22.'30 1.38 56.51 32.48 -4.48 36.30 20.49 24·93 13.28 39· 75 24.28 19·48 84.88 4.78 2 ,..,. N(50,225) (J" -0·58 2.18 10.29 18.20 12·77 3· 05 0·96 -0.82 5.36 3.38 41. 70 63.23 63.45 21. 05 52·75 37.16 19.40 37·51 18.41 74.14 12.12 20.17 8.16 9·89 11. 73 14.45 15.68 ll.41 9·68 9·56 1. 77 -3.00 7·87 6.66 -0·95 5·54 6.45 4.86 2.11 5·19 32.03 30·98 31. 54 35.86 46.46 46.81 42.16 45.00 45·55 39.66 29·32 29·05 20·51 34.28 56.57 59·20 44.25 52.31 53.13 41. 97 OJ \jJ e e e Table 8.1 (Continued) 018 estimates Experiments c Case 7: I i 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 -9·05 -13· 83 ~14.76 -10.85 -8.02 -14.29 -13.03 -7·25 -11. 96 -9·65 ~ ex C "'" 0- N(-10, 9), 11.36 9·35 11.33 5.89 12.63 3.86 4.68 11.32 6.81 13·96 c "'" N(10,9), 10·95 6.17 5·24 9·16 11. 98 5·71 6·96 12·75 8.04 10.35 1.36 -0.65 1.33 -4.10 2.63 -6.14 -5.32 1.32 -3.19 3·96 66.06 48.53 32.03 47·58 43.10 55·06 62.45 32.87 40.87 38.37 ex ,.,. N(0,9), -2.42 0.65 -1.89 -3· 09 -0·27 1.34 -2.18 -3.37 0·54 1·39 66.06 48.53 32.03 47·58 43.10 55.06 62.45 32.87 40.87 38.37 b a ~ ex "'" N(10, 9), 2·58 5.65 3.10 1. 90 4·73 6.34 2.82 1..63 5·54 6.39 Case 8: 2 "'" N(5,2. 25), 26 5·58 9·32 18.22 13.80 0·98 5·32 21. 94 8.14 8.77 -1. ~ 1. 76 5.43 3·02 3·30 5.04 6.01 2.81 2·76 5· 71 6.10 ,.,. N(0,2.25), -11. 26 -4.42 -0.68 8.22 3.80 -9·02 -4.68 11. 94 -1. 86 -1. 22 -1. 74 1.16 -1. 65 -4·74 -0.48 2.16 -2.18 -4.01 0.29 1.67 EB estimates s 0- "" ex 2 2 ,.., "" ~ ~2 0- N(50, 225) 16.00 -1. 21 47·16 5·52 17·07 9·04 52.80 16.52 12.16 27·98 54.48 -0.07 81. 42 6.39 21. 01 35·20 6·72 26·77 43·97 5·95 2 0- ,... N(50,225) 16.00 47·16 17· 07 52.81 27· 98 54.48 81. 42 32.50 26·77 43·97 -10.25 -3.69 -2.45 6.21 2.45 -6.06 -6.05 12.16 -2·58 -1. 04 1. 77 5·43 3·00 3.15 4.85 5· 93 2.89 2.65 5.60 5·82 15· 55 43.59 17· 95 49.42 27·33 49.18 69.15 34.19 26.64 40.86 -1.83 1. 04 -1.32 -4·53 -0.38 1.65 -1. 98 -4.04 0.37 1.64 15.84 44.00 17·28 48.56 26.38 49·49 68.00 33·90 26.15 41.31 (» +'" - • Table 8.1 (Continued ) QLS estimates Experiments 1 2 3 4 5 6 7 8 9 10 e • c ex 2 t3 Case 9: c --- N(10,9), 16.25 12.14 17.47 9·58 10.02 0.81 12·77 13.61 13.15 13.35 10.0 10.0 10.0 10.0 10.0 10.0 10.0 10.0 10.0 10.0 5·0 5·0 5·0 5·0 5·0 5·0 5·0 5·0 5·0 5·0 ex = 10, t3 68.49 36.95 20·74 45·03 56.02 55· 27 36.01 51. 85 42.52 46.88 ~his is a corrected EB estimate using C.F. b a (J" = 5, cr2 ___ 17· 72 3·52 7·29 12.35 12.30 10.65 8·76 8·53 9·23 3.19 EB estimates 2 "" ex 62.84 46.13 23.10 38.83 26.40 84.10 36.47 27·82 64·77 54.01 13.56 5.86 8.28 11·74 10.39 10.43 8.12 8.20 9·24 6.08 s '" '"t3 ~2 cr N(50, 225) 4.48 5.32 5.26 5·12 4.65 5·28 4,84 4.92 5·06 5.48 4·73 5.1lJ. 5·20 5.18 4.83 5·46 4.88 4·94 5·05 5.26 55·42 43·61 23·52 37·95 26.81 69·60 35·66 27·97 56.60 49·41 = .80, which is slightly biased downward. co \.Jl

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Top subcategories

Download Barizi.; (1973)An assessment and some aplications of the empirical Bayes approach to random regression models." Thesis.